- Lots of data for x=1..3, little data for x=4..6, but we are interested in x=4..6
- Should we use all the data? Or just the data from 4..6?
- Using all the data assumes the model holds everywhere
- Borrows strength. Generates a good null distribution.
- Maybe then: extrapolate the fit from 1..3 to generate y-hat-linear. Then run a regression on Y - y-hat-linear. Advantage: uses all the data, but doesn't use it very much.
- Bad idea: simply use all the data. Since most of the data sits in 1..3, this is basically the same as only using the data from 1..3.
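
The extrapolate-then-regress-on-residuals idea above can be sketched as follows. This is only an illustration under stated assumptions: the data are simulated, the "truth" function, sample sizes, and all names (`x_dense`, `x_sparse`, `predict`) are made up, and a straight line is assumed in both steps.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical data: dense for x in [1, 3], sparse for x in [4, 6].
x_dense = rng.uniform(1, 3, size=2000)
x_sparse = rng.uniform(4, 6, size=30)

def truth(x):
    # Assumed truth: linear, plus a bend that only appears past x = 3.
    return 1.0 + 2.0 * x + 0.5 * np.maximum(x - 3.0, 0.0) ** 2

y_dense = truth(x_dense) + rng.normal(0, 1, size=x_dense.size)
y_sparse = truth(x_sparse) + rng.normal(0, 1, size=x_sparse.size)

# Step 1: fit a line on the dense region only.
slope, intercept = np.polyfit(x_dense, y_dense, deg=1)

# Step 2: extrapolate that line into the sparse region (y-hat-linear).
y_hat_linear = intercept + slope * x_sparse

# Step 3: regress the residuals Y - y-hat-linear on x in the sparse region.
resid = y_sparse - y_hat_linear
r_slope, r_intercept = np.polyfit(x_sparse, resid, deg=1)

def predict(x_new):
    """Extrapolated line plus the residual correction fitted on 4..6."""
    x_new = np.asarray(x_new, dtype=float)
    return (intercept + slope * x_new) + (r_intercept + r_slope * x_new)
```

One thing this sketch makes visible: because both steps are linear in x, the combined prediction coincides with an ordinary least-squares line fitted to 4..6 alone. The dense data enters only as the baseline (the null) against which the residual coefficients `r_slope` and `r_intercept` are judged.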

- Point made by George Easton: robust estimators should be evaluated with robust losses
- Thinking about how you would validate an estimator helps focus your mind on what you are trying to accomplish
- Efficiency says to fit based on statistical loss, not economic loss. This requires the model to be correct, so it is taking a risk and isn't robust
- Often a good idea to fit based on the criterion you are going to evaluate with; this is a robust technique
- Lots of fun research on it. (see calibration and no-regret) Fun at least to me!
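
A toy illustration of fitting with the criterion you will evaluate with: a single location parameter is fitted to the same sample under squared loss and under absolute loss. The contaminated sample, the grid search, and the function names are assumptions chosen for the example, not anything from the bankruptcy application.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical sample: mostly well-behaved, with a few gross outliers.
sample = np.concatenate([rng.normal(0.0, 1.0, size=95), np.full(5, 50.0)])

def squared_loss(center, data):
    return np.mean((data - center) ** 2)

def absolute_loss(center, data):
    return np.mean(np.abs(data - center))

# Fit one location parameter by grid search under each criterion.
grid = np.linspace(-5.0, 55.0, 6001)
fit_squared = grid[np.argmin([squared_loss(c, sample) for c in grid])]
fit_absolute = grid[np.argmin([absolute_loss(c, sample) for c in grid])]

# Evaluate each fit under both criteria.
for name, fit in [("squared", fit_squared), ("absolute", fit_absolute)]:
    print(name, fit, squared_loss(fit, sample), absolute_loss(fit, sample))
```

The squared-loss fit lands near the sample mean and is dragged toward the outliers; the absolute-loss fit lands near the median. Each fit wins under its own criterion, which is the sense in which the evaluation criterion should be decided before the fitting method.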

- the problem:
- forecast bankruptcies based on things credit card companies know.
- not an economic model
- prediction is the goal, not estimating the parameters
- millions of person-months of observations
- 1000s of bankruptcies
- 100 basic variables --> 67000 interactions, and dummies for missing values

- Economic loss
- Ideally, classification, with absolute error loss
- More interested in classifying people close to a 5% chance of bankruptcy than people close to a .001% chance.
- closer to squared error than to weighted error
- Most people don't go bankrupt
- Most (as in 90%) people have a forecast of .001 or less
- weighting them heavily would lead to an extrapolation error

- Our criterion then is quadratic loss; better would be weighted quadratic loss, weighting by the importance of the person.
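
The two criteria can be written down in a few lines. This is a minimal sketch: `y` is the 0/1 bankruptcy indicator, `p_hat` the forecast probability, and `weights` a hypothetical per-person importance (how importance would actually be measured is not specified in these notes).

```python
import numpy as np

def quadratic_loss(y, p_hat):
    """Mean squared difference between outcome and forecast."""
    y, p_hat = np.asarray(y, dtype=float), np.asarray(p_hat, dtype=float)
    return np.mean((y - p_hat) ** 2)

def weighted_quadratic_loss(y, p_hat, weights):
    """Quadratic loss with each person's squared error scaled by a weight."""
    y, p_hat = np.asarray(y, dtype=float), np.asarray(p_hat, dtype=float)
    w = np.asarray(weights, dtype=float)
    return np.sum(w * (y - p_hat) ** 2) / np.sum(w)

# Made-up example: four people, one bankruptcy.
y = [0, 0, 0, 1]
p_hat = [0.001, 0.001, 0.05, 0.20]
importance = [1.0, 1.0, 3.0, 5.0]  # hypothetical weights

print(quadratic_loss(y, p_hat))
print(weighted_quadratic_loss(y, p_hat, importance))
```

With equal weights the weighted version reduces to the plain quadratic loss.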

- Searching for independence
- repeated measurements on each person
- GEE is the right answer.
- we took only one observation per person as our cheat
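
The one-observation-per-person cheat might look like the sketch below. How the observation was chosen is not stated in these notes; picking one person-month at random is an assumption, and the record layout (`person_id`, `month`, `bankrupt`) is hypothetical.

```python
import random
from collections import defaultdict

def one_observation_per_person(records, seed=0):
    """Keep a single randomly chosen record for each person_id."""
    rng = random.Random(seed)
    by_person = defaultdict(list)
    for rec in records:
        by_person[rec["person_id"]].append(rec)
    # One draw per person removes the within-person dependence.
    return [rng.choice(recs) for recs in by_person.values()]

# Made-up person-month records.
records = [
    {"person_id": 1, "month": 1, "bankrupt": 0},
    {"person_id": 1, "month": 2, "bankrupt": 0},
    {"person_id": 2, "month": 1, "bankrupt": 0},
    {"person_id": 2, "month": 2, "bankrupt": 1},
    {"person_id": 3, "month": 1, "bankrupt": 0},
]
subsample = one_observation_per_person(records)
```

The price of the cheat is that most person-months are thrown away, which is affordable only because there are millions of them.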

- Impressive graph: show lift chart page 7
- Creation of the 67000 variables (dummies, interactions)
- White estimator for heteroskedasticity
- Variable selection (Bonferroni = 2 log p, we used 2 log p/q)
- Page 31 graph is in sample
- Page 32 graph is out of sample
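
The White estimator and the selection thresholds listed above can be sketched together. Several things here are assumptions rather than statements from the notes: the HC0 form of the White estimator is used, "2 log p" is read as a cutoff on the squared t-statistic, "2 log p/q" is read as 2 log(p/q), and q is taken to be the number of variables already in the model. The simulated data are made up.

```python
import numpy as np

def white_t_stats(X, y):
    """OLS coefficients and t-statistics using White (HC0) standard errors."""
    XtX_inv = np.linalg.inv(X.T @ X)
    beta = XtX_inv @ X.T @ y
    resid = y - X @ beta
    # "Meat" of the sandwich: sum over i of e_i^2 * x_i x_i'.
    meat = X.T @ (X * resid[:, None] ** 2)
    cov = XtX_inv @ meat @ XtX_inv
    return beta, beta / np.sqrt(np.diag(cov))

def bonferroni_threshold(p):
    """Cutoff on |t| implied by t^2 > 2 log p."""
    return np.sqrt(2.0 * np.log(p))

def adaptive_threshold(p, q):
    """Cutoff on |t| implied by t^2 > 2 log(p / q); q is assumed to be
    the number of variables already selected."""
    return np.sqrt(2.0 * np.log(p / q))

# Made-up data: 20 candidate columns, only the first carries signal,
# and the noise variance grows with that column (heteroskedastic).
rng = np.random.default_rng(2)
n, p = 5000, 20
X = rng.normal(size=(n, p))
y = 0.5 * X[:, 0] + rng.normal(size=n) * (1.0 + np.abs(X[:, 0]))

beta, t = white_t_stats(X, y)
selected = np.flatnonzero(np.abs(t) > bonferroni_threshold(p))
```

Under this reading, the adaptive cutoff equals the Bonferroni cutoff when q = 1 and relaxes as more variables enter the model.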

Last modified: Tue Apr 24 09:00:27 2001
