STAT 541: Robust Regression
What we normally do
- Look at the residuals and delete the outliers!
- Alternative: use weighted least squares and put zero weight on the outliers
- Let newY = rank(Y)
- Regress newY on X
- Oops: the errors are now non-normal
- For large samples this is not a problem--if the errors are homoskedastic
- Improved method: regress the normal scores of the ranks on X
- Cool! Works as well as the original
- Lots of wonderful math (see Hájek and Šidák)
- Works well if there is not much signal. In other words, when testing a null hypothesis.
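The rank / normal-score recipe above can be sketched as follows (a minimal sketch on made-up data; the simulated data and all variable names are illustrative, not from the course):

```python
import numpy as np
from statistics import NormalDist

rng = np.random.default_rng(0)
n = 200
x = rng.normal(size=n)
y = 2.0 * x + rng.standard_t(df=2, size=n)      # heavy-tailed errors

# newY = rank of Y (1..n), then the normal score Phi^{-1}(rank / (n + 1))
ranks = y.argsort().argsort() + 1
new_y = np.array([NormalDist().inv_cdf(r / (n + 1)) for r in ranks])

# ordinary least squares of the normal scores on x
X = np.column_stack([np.ones(n), x])
beta, *_ = np.linalg.lstsq(X, new_y, rcond=None)
print(beta)   # slope is on the normal-score scale, not the raw-Y scale
```

Note the fitted slope lives on the normal-score scale, which is fine for testing a null hypothesis but not for interpreting the coefficient directly.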
- Recall the influence function psi(x) (psi = the pitchfork symbol)
- M-estimator normal equations: sum_i psi(e_i/sigma) x_i = 0
- Huber example: psi(x) = x, truncated at +-r
- Identical to what we do by hand
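A minimal sketch of the Huber psi (the truncation point r is the tuning constant; the default 1.345 below is a conventional choice, not from the notes):

```python
import numpy as np

def huber_psi(x, r=1.345):
    """psi(x) = x for |x| <= r, and r * sign(x) beyond that."""
    return np.clip(x, -r, r)

# inside [-r, r] psi is the identity; outside it is flat at +-r
print(huber_psi(np.array([-5.0, -0.5, 0.0, 2.0])))
```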
Iteratively reweighted least squares
- Instead of solving the optimization problem directly, solve a sequence of least squares problems
- sum_i psi(e_i/sigma) x_i = 0 is the same as sum_i w_i e_i x_i = 0 where w_i = psi(e_i/sigma)/e_i. Treat the w_i as fixed weights, solve weighted least squares, recompute the weights, and repeat.
- Totally silly!
- But it works.
- Example: L1 regression has psi(x) = sign(x). But it is better to solve it directly as a linear program.
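A minimal IRLS sketch for the Huber case (illustrative code, not the course's; the MAD-based scale estimate and the tuning constant 1.345 are conventional choices I am assuming):

```python
import numpy as np

def irls_huber(X, y, r=1.345, iters=50):
    beta = np.linalg.lstsq(X, y, rcond=None)[0]            # start from OLS
    for _ in range(iters):
        e = y - X @ beta
        s = np.median(np.abs(e)) / 0.6745 + 1e-12          # robust scale (MAD)
        u = e / s
        # w = psi(u)/u: 1 inside [-r, r], r/|u| outside
        w = np.minimum(1.0, r / np.maximum(np.abs(u), 1e-12))
        sw = np.sqrt(w)                                    # weighted least squares step
        beta = np.linalg.lstsq(X * sw[:, None], y * sw, rcond=None)[0]
    return beta

rng = np.random.default_rng(1)
n = 100
x = rng.normal(size=n)
y = 1.0 + 2.0 * x + rng.normal(size=n)
y[:5] += 20.0                                              # a few gross outliers
X = np.column_stack([np.ones(n), x])
print(irls_huber(X, y))                                    # slope stays near 2
```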
Different from weighted least squares
- In weighted least squares the weights depend only on the X matrix
- Score-function weights depend on the outcomes Y
- Both correct for large residuals
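A tiny illustration of the contrast (made-up data; the Huber weight formula is one choice of score-based weight): weighted-least-squares weights are fixed in advance, while score-based weights are recomputed from the residuals, so corrupting a single Y changes them.

```python
import numpy as np

def huber_weights(X, y, r=1.345):
    """One pass of score-based weights: w_i = psi(e_i/s) / (e_i/s)."""
    beta = np.linalg.lstsq(X, y, rcond=None)[0]
    e = y - X @ beta
    s = np.median(np.abs(e)) / 0.6745 + 1e-12      # robust scale (MAD)
    u = np.abs(e) / s
    return np.minimum(1.0, r / np.maximum(u, 1e-12))

x = np.arange(10.0)
X = np.column_stack([np.ones_like(x), x])
y = x.copy()                    # responses on a perfect line
y_bad = y.copy()
y_bad[5] += 30.0                # corrupt a single response

print(huber_weights(X, y))      # all weights 1: no large residuals
print(huber_weights(X, y_bad))  # the weight on point 5 drops sharply
```

Fixed WLS weights would be identical for `y` and `y_bad`; the score-based weights are not.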
Last modified: Tue Mar 6 08:10:16 2001