STAT 541: Robust Regression
Administrivia
Read section 7.7
Robust Regression
What we normally do
Look at the residuals and delete the outliers!
Alternative theory: use weighted least squares, putting zero weight on the outliers
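The delete-the-outliers recipe can be phrased as two weighted least-squares fits. A minimal sketch for simple linear regression; the 2-sigma cutoff and the data are illustrative choices, not from the notes:

```python
# Sketch: "deleting" outliers = weighted least squares with zero weights.
# The 2-sigma cutoff and the data below are illustrative choices.

def wls_fit(x, y, w):
    """Weighted least-squares fit of y = a + b*x; returns (a, b)."""
    sw = sum(w)
    xbar = sum(wi * xi for wi, xi in zip(w, x)) / sw
    ybar = sum(wi * yi for wi, yi in zip(w, y)) / sw
    b = (sum(wi * (xi - xbar) * (yi - ybar) for wi, xi, yi in zip(w, x, y))
         / sum(wi * (xi - xbar) ** 2 for wi, xi in zip(w, x)))
    return ybar - b * xbar, b

x = list(range(1, 11))
y = [1, 2, 3, 4, 5, 6, 7, 8, 9, 30]     # last point is an outlier

a, b = wls_fit(x, y, [1.0] * len(x))    # ordinary least squares first
resid = [yi - (a + b * xi) for xi, yi in zip(x, y)]
s = (sum(r * r for r in resid) / (len(x) - 2)) ** 0.5
w = [0.0 if abs(r) > 2 * s else 1.0 for r in resid]  # zero weight on outliers
a2, b2 = wls_fit(x, y, w)               # refit without the flagged points
```

Here the nine clean points lie exactly on y = x, so the zero-weighted refit recovers slope 1 and intercept 0.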
Rank regression
Let newY = rank(Y)
Regress newY on X
Oops: non-normal errors
For large samples, not a problem--if the errors are homoscedastic
Improved method: regress the normal scores of newY on X
Cool! Works as well as the original
Lots of wonderful math (see Hájek and Šidák)
Non-standard
Works well when there is not much signal--in other words, when testing a null hypothesis.
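A sketch of the two transformations, using the standard library's NormalDist for the normal scores. The data and the rank/(n+1) plotting position are illustrative choices:

```python
from statistics import NormalDist

def ranks(y):
    """Rank of each observation, 1 = smallest (ties not handled here)."""
    order = sorted(range(len(y)), key=lambda i: y[i])
    r = [0] * len(y)
    for rank, i in enumerate(order, start=1):
        r[i] = rank
    return r

def normal_scores(y):
    """Normal score of each observation: the standard normal quantile
    at rank/(n+1), so the scores look like an ideal normal sample."""
    n = len(y)
    nd = NormalDist()
    return [nd.inv_cdf(r / (n + 1)) for r in ranks(y)]

y = [3.1, 0.2, 9.9, 4.5, 1.7]
newY = ranks(y)            # rank regression: regress newY on X
scores = normal_scores(y)  # improved method: regress the scores on X
```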
M estimators
Recall the influence function psi(x) (psi = pitchfork)
M-estimator normal equations: sum psi(e_i/sigma) x_i = 0
Huber example: psi(x) = x truncated at +/- r
Identical to what we do by hand
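The Huber psi is one line of code; the truncation point r = 1.345 below is a conventional tuning constant, chosen here for illustration rather than taken from the notes:

```python
def huber_psi(x, r=1.345):
    """Huber score function: the identity for |x| <= r, truncated at +/- r.
    (r = 1.345 is a conventional default, an illustrative choice.)"""
    return max(-r, min(r, x))
```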
Iteratively reweighted least squares
Instead of solving the optimization directly, solve a sequence of least-squares problems
sum psi(e/sigma) x = 0 is the same as sum w e x = 0, where w = psi(e/sigma)/e; treat w as fixed from the current residuals and solve weighted least squares.
Totally silly!
But it works.
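A sketch of the iteration for simple linear regression with the Huber psi. The tuning constant, the MAD-based estimate of sigma, and the data are illustrative choices, not details from the notes:

```python
def huber_psi(u, r=1.345):
    """Huber score: identity truncated at +/- r (r = 1.345 is illustrative)."""
    return max(-r, min(r, u))

def wls_fit(x, y, w):
    """Weighted least-squares fit of y = a + b*x; returns (a, b)."""
    sw = sum(w)
    xbar = sum(wi * xi for wi, xi in zip(w, x)) / sw
    ybar = sum(wi * yi for wi, yi in zip(w, y)) / sw
    b = (sum(wi * (xi - xbar) * (yi - ybar) for wi, xi, yi in zip(w, x, y))
         / sum(wi * (xi - xbar) ** 2 for wi, xi in zip(w, x)))
    return ybar - b * xbar, b

def irls(x, y, iters=20):
    """Iteratively reweighted least squares: on each pass, set
    w_i = psi(e_i/sigma) / (e_i/sigma) from the current residuals,
    then solve the weighted least-squares problem."""
    a, b = wls_fit(x, y, [1.0] * len(x))          # start from ordinary LS
    for _ in range(iters):
        e = [yi - (a + b * xi) for xi, yi in zip(x, y)]
        sigma = 1.4826 * sorted(abs(ei) for ei in e)[len(e) // 2]  # MAD scale
        w = [huber_psi(ei / sigma) / (ei / sigma) if ei != 0 else 1.0
             for ei in e]
        a, b = wls_fit(x, y, w)
    return a, b

x = list(range(1, 11))
y = [1.0, 2.1, 2.9, 4.2, 4.8, 6.1, 7.0, 7.9, 9.2, 30.0]  # outlier at the end
a, b = irls(x, y)
```

Each pass is an ordinary weighted least-squares solve; a residual with |e/sigma| > r gets weight r/|e/sigma| < 1, so the outlier is progressively down-weighted and the slope settles near that of the clean points.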
Example: L1 regression, psi(x) = sign(x). But it is better to solve it as a linear program.
Different from weighted least squares
In weighted least squares, the weights depend only on the X matrix
Score-function weights depend on the outcomes Y
Both correct for large residuals
Last modified: Tue Mar 6 08:10:16 2001