Last modified: Thu Sep 29 13:56:22 EDT 2005
by Dean Foster
Statistical Data mining: Bonferroni
Administrivia
- Homework due today. More homework due next week
So: Add all variables with |t| > sqrt(2 log p)
- Called Bonferroni
- Prove tail bound under the null: Phi(x) < phi(x) for x < -1.
- Prove sum of events: P(U Ai) <= sum P(Ai)
- First suggested by Foster and George (for regression) / Donoho
and Johnstone (for wavelets)
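
The rule above can be sketched in a few lines of Python. This is only an
illustration, assuming the t-statistics are approximately standard normal
under the null; the function names are hypothetical and not from the
lecture. It applies the sqrt(2 log p) cutoff and evaluates the union bound
p * 2 * Phi(-tau) on the chance that any null variable gets added.

```python
import math


def normal_pdf(x):
    """Standard normal density phi(x)."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)


def normal_cdf(x):
    """Standard normal CDF Phi(x), computed from the complementary error function."""
    return 0.5 * math.erfc(-x / math.sqrt(2.0))


def bonferroni_threshold(p):
    """Cutoff sqrt(2 log p) for p candidate variables."""
    return math.sqrt(2.0 * math.log(p))


def bonferroni_select(t_stats):
    """Indices of the variables whose |t| exceeds sqrt(2 log p)."""
    if not t_stats:
        return []
    tau = bonferroni_threshold(len(t_stats))
    return [j for j, t in enumerate(t_stats) if abs(t) > tau]


def false_inclusion_bound(p):
    """Union bound on P(some null |t_j| > tau): at most p * 2 * Phi(-tau).

    Assumes every t_j is standard normal under the null.
    """
    tau = bonferroni_threshold(p)
    return 2.0 * p * normal_cdf(-tau)
```

For example, with p = 10 the cutoff is about 2.146, so
bonferroni_select([0.3, -2.5, 1.9, 4.2, -0.7, 2.1, 0.0, -1.2, 3.0, 0.8])
keeps indices 1, 3 and 8. Combining the two "prove" items, the bound
Phi(x) < phi(x) gives p * 2 * Phi(-tau) < 2 p phi(tau) = 2 / sqrt(2 pi)
whenever tau > 1, so the false-inclusion probability stays bounded no
matter how large p is.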
Started with "fit", ended with paranoia
- Back to first principles
- Goal is a good fit: Too poorly defined
- Goal is a good worst-case fit: Bad goal since it encourages adding
all variables
- Goal is good fit over each size model: Idea of Risk Inflation
Risk inflation
- Definition: ratio of the estimator's risk to the best achievable risk
- Alternative definition: compared to oracle
- Look at Table 2 from the paper to see how some simple methods do
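
A small simulation may make the oracle comparison concrete. This is a
sketch under strong simplifying assumptions that are not from the lecture:
an orthonormal design with unit noise variance (so each coefficient
estimate is z_j = mu_j + noise), a hypothetical truth with k coefficients
sitting right at the threshold, and an oracle that runs least squares on
exactly the k true variables, giving it risk k. All names are
illustrative.

```python
import math
import random


def hard_threshold(z, tau):
    """Keep z_j when |z_j| > tau, otherwise set the estimate to zero."""
    return [zj if abs(zj) > tau else 0.0 for zj in z]


def squared_error(estimate, mu):
    """Sum of squared errors between an estimate and the true coefficients."""
    return sum((e - m) ** 2 for e, m in zip(estimate, mu))


def simulate_risk_ratios(p=1000, k=10, reps=200, seed=0):
    """Monte Carlo estimate of risk divided by oracle risk for two rules."""
    rng = random.Random(seed)
    tau = math.sqrt(2.0 * math.log(p))
    # Hypothetical truth: k signals exactly at the threshold, the rest zero.
    mu = [tau] * k + [0.0] * (p - k)
    loss_threshold = 0.0
    loss_full = 0.0
    for _ in range(reps):
        z = [m + rng.gauss(0.0, 1.0) for m in mu]
        loss_threshold += squared_error(hard_threshold(z, tau), mu)
        loss_full += squared_error(z, mu)  # "add all variables" rule
    # Oracle: least squares on the k true variables, unit noise variance.
    oracle_risk = float(k)
    return {
        "threshold": loss_threshold / reps / oracle_risk,
        "full_model": loss_full / reps / oracle_risk,
        "two_log_p": 2.0 * math.log(p),
    }
```

Calling simulate_risk_ratios() returns the two ratios alongside 2 log p
for comparison. Under these assumptions the full model has expected ratio
p / k = 100, while the thresholding rule should land well below that.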
dean@foster.net