Consider the following data on cleaning crews (cleaning.jmp). We expect that the number of
rooms cleaned will be linear in the number of crews sent to clean
them. In fact, we might even expect that if we send zero crews, we
will get zero rooms cleaned. We want Y to be the number of rooms
cleaned and X to be the number of crews sent out. How many rooms does
each crew clean?
Plot the data, run a simple regression, and create prediction
bounds. Do the data appear homoskedastic?
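A sketch of the same steps in Python with statsmodels, assuming
cleaning.jmp has been exported to cleaning.csv with columns named
Crews and Rooms (those column names are a guess):

import pandas as pd
import statsmodels.api as sm
import matplotlib.pyplot as plt

df = pd.read_csv("cleaning.csv").sort_values("Crews")
X = sm.add_constant(df["Crews"])                # intercept + number of crews
ols = sm.OLS(df["Rooms"], X).fit()
pred = ols.get_prediction(X).summary_frame(alpha=0.05)  # obs_ci_* = 95% prediction bounds

plt.scatter(df["Crews"], df["Rooms"])
plt.plot(df["Crews"], pred["mean"], label="OLS fit")
plt.plot(df["Crews"], pred["obs_ci_lower"], "--", label="95% prediction bounds")
plt.plot(df["Crews"], pred["obs_ci_upper"], "--")
plt.xlabel("Crews"); plt.ylabel("Rooms"); plt.legend(); plt.show()

If the vertical spread of the points grows with the number of crews,
the data are not homoskedastic.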
Transform the data to do a weighted least squares. Try using
both a standard deviation proportional to X and one proportional
to the square root of X. Plot both. Which appears more
homoskedastic?
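A sketch of the two weighted fits, continuing from the code above.
WLS weights are one over the error variance, so sd proportional to X
means weights 1/X^2 and sd proportional to sqrt(X) means weights 1/X:

wls_x    = sm.WLS(df["Rooms"], X, weights=1.0 / df["Crews"] ** 2).fit()  # sd ~ X
wls_sqrt = sm.WLS(df["Rooms"], X, weights=1.0 / df["Crews"]).fit()       # sd ~ sqrt(X)

# Judge homoskedasticity on the weighted scale: residual / assumed sd vs crews.
fig, axes = plt.subplots(1, 2, sharey=True)
axes[0].scatter(df["Crews"], wls_x.resid / df["Crews"])
axes[1].scatter(df["Crews"], wls_sqrt.resid / df["Crews"] ** 0.5)
axes[0].set_title("sd proportional to X")
axes[1].set_title("sd proportional to sqrt(X)")
plt.show()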
Use the White estimator (the sandwich estimator) to generate
standard errors for the slope and intercept.
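The White estimator is available in statsmodels as the HC0 covariance
type; a sketch, reusing df and X from above:

ols_white = sm.OLS(df["Rooms"], X).fit(cov_type="HC0")  # White / sandwich covariance
print(ols_white.bse)         # robust SEs for the intercept and slope
print(ols_white.conf_int())  # robust confidence intervals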
Discussion question: (Please type up a one-page answer to the
following.) Compare the confidence intervals for the slope from
each of the methods above. Which ones do you believe? Are the
ones that are theoretically wrong qualitatively wrong?
Our theory suggests that the intercept should be zero. Which
is the correct test to run? Do we fail to reject the null? Do
any of the other tests incorrectly reject the null?
Pick the weighted least squares model that appears to be the
most homoskedastic. Now use the White estimator on that
model. Does it change the SEs very much?
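Sketch, assuming the sd-proportional-to-sqrt(X) fit turned out to be
the most homoskedastic one (swap in the other weights if not):

wls_best   = sm.WLS(df["Rooms"], X, weights=1.0 / df["Crews"]).fit()
wls_robust = sm.WLS(df["Rooms"], X, weights=1.0 / df["Crews"]).fit(cov_type="HC0")
print(wls_best.bse)    # model-based SEs
print(wls_robust.bse)  # sandwich SEs; if the weights are about right, these should be close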
The envelope please: Add up all the rooms cleaned. Add up
all the crews. Divide these two to come up with an average
number of rooms cleaned per crew. This should match one of the
slopes you computed above. (You could also compute a standard
error by hand to see which confidence intervals above are the
closest to describing the right answer. But you don't have to
do this.)
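The arithmetic, continuing with the same data frame. (Algebraically,
sum(Y)/sum(X) is the slope of a no-intercept WLS with weights 1/X,
which is why it should line up with one of the slopes above.)

rooms_per_crew = df["Rooms"].sum() / df["Crews"].sum()
print(rooms_per_crew)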
Leverage
How much do the betas depend on a single point? (this leads to a
p x n matrix of leverages)
How much do the predictions depend on a single point? (this leads to
a vector of n leverages)
How much does prediction of i depend on value of i?
prediction is X_i beta-hat
beta-hat = (X'X)^-1 X'Y
prediction is X_i (X'X)^-1 X'Y
To convert Y to predictions: (X (X'X)^-1 X') Y
hat matrix: H = X (X'X)^-1 X'
Called projection matrix. Or "hat matrix" since it puts
a hat on y.
h_ii is d y-hat_i / d y_i, called the leverage
Nice property: depends ONLY on X's. So if you decide to toss a
point out based on its leverage, you aren't biasing your results.
Use Mahalanobis distance to "see" leverage
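A small numpy sketch of the hat matrix, the leverages, and the
Mahalanobis connection (for a model with an intercept,
h_ii = 1/n + d_i^2/(n-1), where d_i is the Mahalanobis distance of
x_i from the mean). The data here are made up just to run the algebra:

import numpy as np

rng = np.random.default_rng(0)
x = rng.uniform(1, 10, size=30)
X = np.column_stack([np.ones_like(x), x])  # intercept + one predictor

H = X @ np.linalg.inv(X.T @ X) @ X.T       # hat (projection) matrix
h = np.diag(H)                             # leverages: h_ii = d y-hat_i / d y_i

d2 = (x - x.mean()) ** 2 / x.var(ddof=1)   # squared Mahalanobis distance (one predictor)
print(np.allclose(h, 1 / len(x) + d2 / (len(x) - 1)))  # True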
Influence = leverage x outlier
A point that isn't leveraged doesn't affect the outcome very much;
a point that isn't an outlier doesn't affect the outcome very much;
a point needs both to be influential.
Draw pictures of the various kinds of outliers (in the MBA class
these are called cottages, direct mail, and crime).
Various definitions of influence
DFFITS = change in forecast i if you leave out observation i
DFFITS_i = (y-hat_i - y-hat_i,(-i)) / SE(y-hat_i)
         = R-student_i * sqrt(h_ii / (1 - h_ii))
Looking at the squared value, and noting that h_ii is usually
close to zero (so 1 - h_ii is about 1), gives
DFFITS^2 ~ h_ii * R-student^2, which is the
influence = leverage x outlier idea.
DFBETAS = change in slope j if you leave out observation i
no easy formula
no longer related to the leverage (h_ii)
depends on the scale of beta: changing the scale of X changes
the value of DFBETAS
DFALL = Cook's D = change in all the betas = change in all the
predictions
uses the "natural" parameter space / prediction loss
Cook's D is the squared-distance change from leaving out point i
tells how much the parameters change, measured in the natural basis
tells how much all the predictions change on average
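All of these come out of a single statsmodels call; a sketch,
assuming ols is the simple regression fit from the cleaning-crew
exercise above:

from statsmodels.stats.outliers_influence import OLSInfluence

infl = OLSInfluence(ols)
leverage = infl.hat_matrix_diag         # h_ii
dffits, dffits_cutoff = infl.dffits     # change in fitted value i when i is left out
dfbetas = infl.dfbetas                  # n x p: change in each beta when i is left out
cooks_d, cooks_p = infl.cooks_distance  # Cook's D and its p-values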