Last modified: Thu Nov  3 14:11:17 EST 2005
by Dean Foster
Statistical Data mining: Streaming searching
Administrivia
Recall definitions of V/R/S/T/etc
pass out handout
FDR: False Discovery Rate
   -  FDR = E(V/R)
   -  FDR < alpha is target
   -  Simes procedure will control FDR
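The quantity inside the expectation, V/R, can be computed for a single run of any testing procedure. A minimal sketch in Python, assuming the usual convention (not stated in these notes) that V/R is taken to be 0 when nothing is rejected; the function and argument names are made up for illustration:

```python
def false_discovery_proportion(rejected, is_null):
    """Return V/R for one experiment.

    rejected[i] is True if hypothesis i was rejected.
    is_null[i]  is True if hypothesis i is a true null.
    """
    # R = total number of rejections.
    r = sum(1 for rej in rejected if rej)
    # V = number of rejected hypotheses that were actually null.
    v = sum(1 for rej, null in zip(rejected, is_null) if rej and null)
    # Assumed convention: V/R = 0 when R = 0.
    return v / r if r > 0 else 0.0
```

FDR is the average of this proportion over repeated experiments, so in a simulation it would be estimated by averaging the return value over many replications.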
EDC: Excess discovery count
   -  EDC = E(S - gamma R) + alpha
   -  EDC > 0 is target
   -  alpha controls FWER(0)
   -  gamma controls FDR at rate about 1-gamma
Theorem: alpha investing controls EDC
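To see where the "rate about 1-gamma" comes from, it may help to rewrite EDC in terms of V. The sketch below assumes the usual bookkeeping R = V + S (V false rejections, S true rejections, R total rejections), which these notes recall but do not restate:

```latex
\begin{align*}
\mathrm{EDC} &= \alpha + E(S - \gamma R) \\
             &= \alpha + E(R - V) - \gamma\,E(R) \\
             &= \alpha + (1-\gamma)\,E(R) - E(V), \\
\mathrm{EDC} > 0 &\iff E(V) < \alpha + (1-\gamma)\,E(R).
\end{align*}
```

Under that assumption, and up to the additive alpha, the expected number of false rejections is at most a fraction 1-gamma of the expected number of rejections. This bounds a ratio of expectations, E(V) against E(R), rather than E(V/R) itself. If every null hypothesis is true, then S = 0 and R = V, and the condition reduces to E(V) < alpha/gamma.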
Simes procedure for FDR
   -  Does alpha investing control FDR? Unknown.
   -  What does control FDR? Simes does.
       -  Order the p-values from smallest to largest.
       -  Try the first at alpha/m, the second at 2 alpha/m, etc.
       -  Once you fail to reject, stop.
   -  If the tests are independent, it is "easy" to prove that this controls the FDR.
      (Draw picture.)
   -  In the worst case, it is "easy" to show that you lose a log(m) factor.
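The steps of the Simes procedure listed above can be sketched in a few lines of Python. This is only an illustration of the stopping rule as written in these notes (stop at the first failure to reject); the function name, the default alpha, and the example p-values are made up. The commonly cited Benjamini-Hochberg step-up variant differs in that it rejects up to the largest k with p_(k) <= k alpha/m, rather than stopping at the first failure.

```python
def simes_step_down(p_values, alpha=0.05):
    """Return the indices (into p_values) of the rejected hypotheses."""
    m = len(p_values)
    # Order the p-values from smallest to largest, remembering positions.
    order = sorted(range(m), key=lambda i: p_values[i])
    rejected = []
    for k, i in enumerate(order, start=1):
        # The k-th smallest p-value is tried at level k * alpha / m.
        if p_values[i] <= k * alpha / m:
            rejected.append(i)
        else:
            # Once a test fails to reject, stop.
            break
    return rejected


# Hypothetical p-values, for illustration only.
p = [0.30, 0.001, 0.04, 0.012, 0.8]
print(simes_step_down(p, alpha=0.05))  # prints [1, 3]
```

With m = 5 and alpha = 0.05 the thresholds are 0.01, 0.02, 0.03, and so on. The two smallest p-values (0.001 and 0.012) pass, the third (0.04) exceeds 0.03, and the procedure stops there.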
dean@foster.net