# Statistics 541: JMP and L and M estimators

## Administrivia

- First homework:
  - Find a data set with between 5 and 20 columns of comparable
    information.
  - Read the data into JMP and compute the statistics computed on
    page 10 of the handout. Look at comparison box plots for all your
    columns. Be sure to put them on the same scale!
  - Read the data into Splus and compute the statistics for all the
    columns in your dataset. (See the Splus code on the pages between
    17 and 18.) The idea is to do this somewhat automatically, either
    by writing a function or by using your editor to build a command
    list. You should be able to regenerate your output after a small
    change in the data without clicking lots and lots of buttons.
  - Comment on what you found out scientifically. (E.g., which column
    has the highest values, and why is this of interest?)
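Since the Splus code itself is on separate handout pages, here is a minimal sketch (in Python, with made-up function and column names) of the "somewhat automatically" idea: one function that summarizes every column, so the output can be regenerated after a data change without any clicking.

```python
# Hypothetical sketch: summarize every column of a small dataset in
# one call, so the output regenerates automatically when data change.
import statistics

def summarize(columns):
    """Return {name: (n, mean, median, stdev, IQR)} for each column."""
    out = {}
    for name, values in columns.items():
        values = sorted(values)
        n = len(values)
        q1 = values[n // 4]          # crude quartiles, for illustration
        q3 = values[(3 * n) // 4]
        out[name] = (n,
                     statistics.mean(values),
                     statistics.median(values),
                     statistics.stdev(values),
                     q3 - q1)
    return out

data = {"a": [1.0, 2.0, 3.0, 4.0, 5.0],
        "b": [2.0, 2.5, 3.0, 3.5, 100.0]}
for name, stats in summarize(data).items():
    print(name, stats)
```

Editing `data` (or reading it from a file) and rerunning the script is the whole workflow; nothing column-specific is typed twice.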

## JMP

Look at repairs.jmp.

## Mathematics of robustness

The goal is to provide a framework for discussing the properties of estimators.

### L estimators

- Rank the data (called Y_{(1)}, ..., Y_{(n)}, the order statistics)
- An L estimator is a weighted sum of the ranked data
- easy example: sample average is an L estimator
- harder example: the trimmed mean is an L estimator (trim everything
and you have the median)
- IQR is an L estimate of scale
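The examples above can be checked numerically: the average, the trimmed mean, and the median are all weighted sums of the order statistics, differing only in the weights. A short sketch (function names are my own):

```python
# Sketch: L estimators as weighted sums of the order statistics
# Y_(1) <= ... <= Y_(n).

def l_estimate(y, weights):
    """Weighted sum of the sorted data: sum_i w_i * Y_(i)."""
    ys = sorted(y)
    return sum(w * v for w, v in zip(weights, ys))

y = [7.0, 1.0, 4.0, 100.0, 3.0]
n = len(y)

# Sample average: equal weights 1/n.
avg = l_estimate(y, [1.0 / n] * n)

# Trimmed mean: drop the smallest and largest value, equal
# weights on the rest.
trim = l_estimate(y, [0.0] + [1.0 / (n - 2)] * (n - 2) + [0.0])

# Median: all weight on the middle order statistic (odd n).
med = l_estimate(y, [0.0, 0.0, 1.0, 0.0, 0.0])

print(avg, trim, med)   # the outlier 100 moves avg, not trim or med
```

Trimming more and more pushes all the weight onto the middle order statistic, which is exactly the "trim everything and you have the median" remark above.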

### M estimators

- objective function: minimize sum rho(y_{i} - theta)
- phi is the derivative of rho
- Newton says the solution solves: sum phi(y_{i} - theta) = 0
- Example: rho = x^{2}, phi = 2x, solution is the average
- Example: rho = |x|, phi = sign(x), solution is the median
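Both examples can be verified by solving sum phi(y_i - theta) = 0 numerically. A small sketch (the bisection solver is my own choice; it works here because both phi functions are nondecreasing, making the sum a decreasing function of theta):

```python
# Sketch: solve sum_i phi(y_i - theta) = 0 by bisection for the
# two rho/phi pairs above.

def solve_m(y, phi, lo=-1e6, hi=1e6, tol=1e-9):
    """Root of g(theta) = sum_i phi(y_i - theta), assuming g decreasing."""
    def g(theta):
        return sum(phi(v - theta) for v in y)
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if g(mid) > 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0

def sign(x):
    return (x > 0) - (x < 0)

y = [1.0, 2.0, 3.0, 4.0, 100.0]
mean_hat = solve_m(y, lambda x: 2.0 * x)   # rho = x^2  -> the average
med_hat = solve_m(y, sign)                 # rho = |x|  -> the median
print(mean_hat, med_hat)                   # 22.0 and (about) 3.0
```

Note how the outlier 100 drags the rho = x^2 solution to 22 while the rho = |x| solution stays at the middle observation.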

### Invariance

Shift invariance: T(y + c) = T(y) + c
Scale invariance: T(ay) = aT(y)

Obvious for L estimators. Not always true for M estimators.
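A quick numeric check of the two definitions for the median (an L estimator, so both should hold exactly); the sample values are made up:

```python
# Check shift and scale invariance, T(y + c) = T(y) + c and
# T(ay) = aT(y), for the median on a small sample.
import statistics

y = [1.0, 4.0, 9.0, 2.0, 5.0]
c, a = 10.0, 3.0

assert statistics.median([v + c for v in y]) == statistics.median(y) + c
assert statistics.median([a * v for v in y]) == a * statistics.median(y)
print("median is shift and scale invariant on this sample")
```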

### Classic M-estimator

The classic M-estimator is the biweight:
phi(u) = u(1 - u^{2})^{2} for |u| < 1 (and 0 otherwise)
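Because phi(u) = u(1 - u^2)^2 vanishes for |u| >= 1, the estimating equation is a weighted mean with weights w(u) = (1 - u^2)^2, which suggests an iteratively reweighted average. A sketch; the residual scaling (MAD times a tuning constant c) is an assumption of mine, not from the notes:

```python
# Sketch: biweight location estimate by iteratively reweighted
# averaging.  Observations with |u| >= 1 get weight exactly 0.
import statistics

def biweight_location(y, c=6.0, iters=50):
    theta = statistics.median(y)                        # robust start
    s = statistics.median([abs(v - theta) for v in y])  # MAD scale
    if s == 0:
        return theta
    for _ in range(iters):
        u = [(v - theta) / (c * s) for v in y]
        w = [(1 - ui ** 2) ** 2 if abs(ui) < 1 else 0.0 for ui in u]
        theta = sum(wi * vi for wi, vi in zip(w, y)) / sum(w)
    return theta

y = [1.0, 2.0, 3.0, 4.0, 1000.0]
print(biweight_location(y))   # the wild value 1000 gets weight 0
```

The hard rejection of distant points (weight exactly zero) is what distinguishes the biweight from, say, the rho = x^2 estimator, whose weights never vanish.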

### Sensitivity and influence

sensitivity is the effect of adding one observation to a small
dataset.
Influence curve is n times the effect of adding one observation to a
large number of observations.

Claim: The influence curve is proportional to the phi function.
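The claim can be eyeballed with the finite-sample version, the sensitivity curve SC(x) = (n+1) * (T(y with x added) - T(y)). A sketch on a made-up sample: for the mean, SC(x) is linear in x (matching phi = 2x up to constants); for the median it is bounded (matching phi = sign(x)).

```python
# Sketch: finite-sample sensitivity curve as a stand-in for the
# influence curve.
import statistics

def sensitivity(T, y, x):
    """(n+1) * (T(y plus the extra point x) - T(y))."""
    n = len(y)
    return (n + 1) * (T(y + [x]) - T(y))

y = [1.0, 2.0, 3.0, 4.0, 5.0]
for x in (-100.0, 0.0, 100.0):
    print(x,
          sensitivity(statistics.mean, y, x),    # grows linearly in x
          sensitivity(statistics.median, y, x))  # stays bounded
```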

### Properties of estimators

- efficiency
- breakdown (how much bad data can be included without
arbitrarily blowing up the estimator)
- gross error sensitivity (the maximum of the influence curve)
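Breakdown is easy to demonstrate: corrupt a single observation arbitrarily and compare estimators. A small sketch (sample values made up); the mean has breakdown 0, while the median tolerates this single wild value:

```python
# Demo of breakdown: one arbitrarily wild observation ruins the
# mean but barely moves the median.
import statistics

clean = [1.0, 2.0, 3.0, 4.0, 5.0]
bad = clean[:-1] + [1e9]          # replace one value with garbage

print(statistics.mean(clean), statistics.median(clean))
print(statistics.mean(bad), statistics.median(bad))   # mean blows up
```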

Last modified: Tue Jan 23 08:50:00 2001