Last modified: Thu Sep 29 13:56:22 EDT 2005
by Dean Foster

# Statistical Data mining: Bonferroni

## Administrivia

- Homework due today. More homework due next week

### So: Add all variables with |t| > sqrt(2 log p)

- Called Bonferroni
- Prove tail of the null: Phi(x) < phi(x) for x < -1 (Phi is the standard normal CDF, phi its density).
- Prove sum of events: P(U A_{i}) ≤ sum P(A_{i})

- First suggested by Foster and George (for regression) / Donoho
and Johnstone (for wavelets)
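
The two proofs above combine to justify the threshold. A minimal sketch in Python, assuming each t-statistic is standard normal under the null; the function names are illustrative and not from the lecture. With cut = sqrt(2 log p), each null variable exceeds the cut with probability 2 Phi(-cut) < 2 phi(cut) = 2 / (sqrt(2 pi) p), so summing over p variables keeps the chance of any false addition below 2 / sqrt(2 pi), about 0.8.

```python
import math

def normal_pdf(x):
    """Standard normal density phi(x)."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)

def normal_cdf(x):
    """Standard normal CDF Phi(x), via the complementary error function."""
    return 0.5 * math.erfc(-x / math.sqrt(2.0))

def bonferroni_threshold(p):
    """Threshold sqrt(2 log p) for p candidate variables (assumes p >= 2)."""
    return math.sqrt(2.0 * math.log(p))

def select(t_stats):
    """Indices of variables whose |t| exceeds sqrt(2 log p)."""
    cut = bonferroni_threshold(len(t_stats))
    return [i for i, t in enumerate(t_stats) if abs(t) > cut]

def false_add_bound(p):
    """Sum-of-events bound on P(some null variable is added): p * 2 * Phi(-cut)."""
    return p * 2.0 * normal_cdf(-bonferroni_threshold(p))
```

For example, `select([0.5, -4.2, 1.0, 3.9])` uses p = 4, so the cut is about 1.67, and it returns `[1, 3]`.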

## Started with "fit" ended with paranoia

- Back to first principles
- Goal is a good fit: Too poorly defined
- Goal is a good worst-case fit: Bad goal since it encourages adding
all variables
- Goal is good fit over each size model: Idea of Risk Inflation

## Risk inflation

- Definition: ratio of risks to best
- Alternative definition: compared to oracle
- Look at Table 2 from the paper to see how some simple methods do
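
One way to read the "compared to oracle" definition is as a ratio that can be estimated by simulation. The sketch below is a hedged illustration, not the paper's computation. It assumes an orthonormal design with unit noise variance, so each least-squares coefficient is the true value plus N(0, 1) noise, and it takes the oracle to be least squares on the truly nonzero coefficients, with risk equal to their count. The names `hard_threshold_risk`, `oracle_risk`, and `risk_ratio` are hypothetical.

```python
import math
import random

def hard_threshold_risk(beta, n_rep=2000, seed=0):
    """Monte Carlo estimate of E sum_i (est_i - beta_i)^2 when a
    coefficient is kept only if |z_i| > sqrt(2 log p).
    Assumption: z_i = beta_i + N(0, 1) (orthonormal design, unit variance)."""
    rng = random.Random(seed)
    cut = math.sqrt(2.0 * math.log(len(beta)))
    total = 0.0
    for _ in range(n_rep):
        for b in beta:
            z = b + rng.gauss(0.0, 1.0)
            est = z if abs(z) > cut else 0.0
            total += (est - b) ** 2
    return total / n_rep

def oracle_risk(beta):
    """Risk of least squares on the true nonzero coefficients:
    one unit of variance per nonzero coefficient."""
    return float(sum(1 for b in beta if b != 0.0))

def risk_ratio(beta, n_rep=2000, seed=0):
    """Ratio of thresholding risk to oracle risk for one fixed beta."""
    q = oracle_risk(beta)
    if q == 0.0:
        raise ValueError("need at least one nonzero coefficient")
    return hard_threshold_risk(beta, n_rep, seed) / q
```

This gives the ratio for a single coefficient vector only. A worst-case figure would require searching over coefficient vectors, which this sketch does not attempt.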

dean@foster.net