# Statistics 541: Robust Regression

## Robust Regression

### What we normally do

• Look at the residuals and delete the outliers!
• Equivalent alternative: use weighted least squares and put zero weight on the outliers
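The equivalence between deleting points and zero-weighting them is easy to check numerically. A minimal sketch (simulated data, one planted outlier; variable names are illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 30
x = rng.normal(size=n)
y = 2.0 * x + rng.normal(size=n)
y[0] += 15.0  # plant one gross outlier

X = np.column_stack([np.ones(n), x])

# OLS after deleting the outlier by hand
keep = np.ones(n, dtype=bool)
keep[0] = False
beta_delete, *_ = np.linalg.lstsq(X[keep], y[keep], rcond=None)

# Weighted least squares with weight 0 on the outlier, 1 elsewhere:
# multiply rows of X and y by sqrt(w) and run ordinary least squares
w = keep.astype(float)
sw = np.sqrt(w)
beta_wls, *_ = np.linalg.lstsq(sw[:, None] * X, sw * y, rcond=None)
```

The two fits agree to machine precision, since a zero weight removes a row's contribution to the normal equations entirely.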

### Rank regression

• Let newY = rank(Y)
• Regress newY on the predictors
• Oops: non-normal errors
• For large samples this is not a problem, provided the errors are homoscedastic
• Improved method: regress the normal scores of newY on the predictors
• Cool! Works about as well as the original
• Lots of wonderful math (see Hájek and Šidák)
• Non-standard
• Works well when there is not much signal, i.e., when testing a null hypothesis.
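Both variants can be sketched in a few lines. This is a minimal illustration, assuming the "normal score" transform is the van der Waerden score Φ⁻¹(rank/(n+1)); data and names are made up:

```python
import numpy as np
from scipy.stats import rankdata, norm

rng = np.random.default_rng(1)
n = 100
x = rng.normal(size=n)
y = 1.5 * x + rng.standard_t(df=2, size=n)  # heavy-tailed errors

X = np.column_stack([np.ones(n), x])

# Plain rank regression: replace Y by its ranks, then least squares
r = rankdata(y)
beta_rank, *_ = np.linalg.lstsq(X, r, rcond=None)

# Normal-scores regression: map ranks through the normal quantile
# function, Phi^{-1}(rank / (n+1)), then least squares
z = norm.ppf(r / (n + 1))
beta_ns, *_ = np.linalg.lstsq(X, z, rcond=None)
```

Note the slope is on the transformed scale, so it is useful for testing (is there a relationship?) rather than for estimating the original regression coefficient.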

### M estimators

• Recall the influence function ψ(x) (psi, the "pitchfork")
• M-estimator normal equations: Σ ψ(eᵢ/σ) xᵢ = 0
• Huber example: ψ(x) = x, truncated at ±r
• Identical in spirit to what we do by hand: small residuals count fully, large ones are capped
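The Huber ψ is one line of code. A sketch (the truncation point r = 1.345 is a conventional choice for 95% efficiency under normality, not something fixed by the notes):

```python
import numpy as np

def huber_psi(x, r=1.345):
    """Huber influence function: identity for |x| <= r, capped at +/-r beyond."""
    return np.clip(x, -r, r)
```

So a residual of 0.5 passes through unchanged, while a residual of 10 contributes only r, exactly the "trim the big ones" behavior done by hand.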

### Iteratively reweighted least squares

• Instead of solving the optimization problem directly, solve a sequence of weighted least-squares problems
• Rewrite Σ ψ(eᵢ/σ) xᵢ = 0 as Σ wᵢ eᵢ xᵢ = 0, where wᵢ = ψ(eᵢ/σ)/eᵢ; then treat the wᵢ as fixed weights and iterate
• Totally silly!
• But it works.
• Example: L1 regression has ψ(x) = sign(x), but that case is better solved as a linear program (LP).
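The loop above can be sketched directly. A minimal IRLS implementation for the Huber ψ, with the scale σ estimated by the MAD at each step (an assumption; the notes do not specify the scale estimate), on made-up data with one gross outlier:

```python
import numpy as np

def huber_psi(x, r=1.345):
    return np.clip(x, -r, r)

def irls(X, y, r=1.345, tol=1e-8, max_iter=100):
    """Iteratively reweighted least squares for the Huber M-estimator."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)  # start from OLS
    for _ in range(max_iter):
        e = y - X @ beta
        sigma = np.median(np.abs(e)) / 0.6745 + 1e-12  # MAD scale estimate
        u = e / sigma
        # w = psi(u)/u, taking w = 1 at u = 0 (the limit for Huber)
        w = np.where(np.abs(u) < 1e-8, 1.0, huber_psi(u, r) / u)
        sw = np.sqrt(w)
        beta_new, *_ = np.linalg.lstsq(sw[:, None] * X, sw * y, rcond=None)
        if np.max(np.abs(beta_new - beta)) < tol:
            return beta_new
        beta = beta_new
    return beta

# demo: true line y = 1 + 2x, one point shifted far off the line
rng = np.random.default_rng(0)
n = 50
x = rng.normal(size=n)
y = 1.0 + 2.0 * x + 0.1 * rng.normal(size=n)
y[0] += 50.0
X = np.column_stack([np.ones(n), x])

beta_ols, *_ = np.linalg.lstsq(X, y, rcond=None)
beta_rob = irls(X, y)
```

On this data the outlier gets weight ψ(u)/u ≈ r/|u| ≈ 0, so the robust fit effectively reproduces the "delete the outlier" answer while OLS is pulled off the true line.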

### Different from weighted least squares

• Classical weighted least squares: the weights are fixed in advance and depend only on the X matrix
• Score-function weights depend on the residuals, hence on the outcomes Y
• Both can correct for large residuals