(What to hand in: Write up about one paragraph describing what you
found. Then provide copies of the output that you generated to answer
the questions. Sloppy is fine on this assignment!)
First homework to be turned in: Use the web to find two
unrelated time series. By this I mean two variables
that are measured over time. For example, it might be
murders in the US and cell phones in Asia. Import both
of these into JMP and run a regression of one on the
other.
What is the t-statistic for the slope?
Do you believe it is significant?
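If you want to check the JMP number by hand, here is a rough Python sketch (not part of the assignment; it assumes NumPy is installed). The two series are made-up independent random walks standing in for your downloaded data, and slope_t_statistic is a hypothetical helper name. The point is only to show where the slope's t-statistic comes from.

```python
import numpy as np

def slope_t_statistic(x, y):
    """Return (slope, t) for the least-squares line y = a + b*x."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    n = len(x)
    sxx = np.sum((x - x.mean()) ** 2)
    slope = np.sum((x - x.mean()) * (y - y.mean())) / sxx
    intercept = y.mean() - slope * x.mean()
    resid = y - (intercept + slope * x)
    s2 = np.sum(resid ** 2) / (n - 2)   # residual variance estimate
    se = np.sqrt(s2 / sxx)              # standard error of the slope
    return slope, slope / se

# Made-up stand-ins for two unrelated series: independent random walks.
rng = np.random.default_rng(0)
x = np.cumsum(rng.normal(size=200))
y = np.cumsum(rng.normal(size=200))

slope, t = slope_t_statistic(x, y)
print(f"slope = {slope:.3f}, t = {t:.2f}")
```

Try a few different seeds. Series like these are unrelated by construction, so a large t-statistic here is a warning about the method, not evidence of a relationship.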
Create a variable called "TIME".
Plot your residuals vs. TIME. Do you see a
pattern?
Your residuals are probably time-dependent and hence not
independent.
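A numeric companion to the plot, sketched in Python with NumPy. This is an addition, not something the assignment asks for. It assumes TIME is just the row number 1, 2, 3, ..., and it uses the lag-1 autocorrelation of the residuals as a rough measure of time dependence (values near 0 suggest little dependence, values near 1 suggest a lot). The data are again made-up random walks, and the function names are hypothetical.

```python
import numpy as np

def ols_residuals(x, y):
    """Residuals from the least-squares fit of y on x with an intercept."""
    X = np.column_stack([np.ones(len(x)), x])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return y - X @ beta

def lag1_autocorrelation(r):
    """Correlation between each residual and the one before it."""
    r = np.asarray(r, dtype=float)
    r = r - r.mean()
    return np.sum(r[1:] * r[:-1]) / np.sum(r ** 2)

rng = np.random.default_rng(1)
n = 200
time = np.arange(1, n + 1)            # assumed meaning of TIME: row number
x = np.cumsum(rng.normal(size=n))     # made-up series
y = np.cumsum(rng.normal(size=n))     # made-up series

resid = ols_residuals(x, y)
for t_i, r_i in zip(time[:5], resid[:5]):
    print(t_i, round(r_i, 3))         # first few (TIME, residual) pairs
print("lag-1 autocorrelation:", round(lag1_autocorrelation(resid), 3))
```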
Cures
(Note: this section requires class 4 from the FSW book.)
Lag your Y variable. (Use the formula editor to make a new
column called lag(y)).
Now do a multiple regression of Y on X and lag(Y).
Save the new residuals. Do your residuals still show
dependence on time? Has it at least been reduced?
(If that didn't fix the dependency, try making lag(2,Y) and
lag(3,Y) and dropping both of them into the regression.)
If you have removed the dependency, what is the t-statistic
for the effect of X on Y? Does this match your intuition better?
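The lag cure can be sketched outside JMP as well. The Python below (NumPy assumed) is only an illustration: lag and ols_t_statistics are hypothetical helpers, not JMP functions, and the data are made-up random walks. It shifts Y down by one row, drops the first row (which has no lagged value), and fits Y on X and lag(Y).

```python
import numpy as np

def lag(v, k=1):
    """Shift v down by k rows (k >= 1); the first k entries become NaN."""
    v = np.asarray(v, dtype=float)
    out = np.full(len(v), np.nan)
    out[k:] = v[:-k]
    return out

def ols_t_statistics(columns, y):
    """Fit y on an intercept plus the given columns, skipping rows with NaN.

    Returns (coefficients, t-statistics, residuals).
    """
    X = np.column_stack([np.ones(len(y))] + list(columns))
    keep = ~np.isnan(X).any(axis=1)
    X, y = X[keep], np.asarray(y, dtype=float)[keep]
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    s2 = resid @ resid / (len(y) - X.shape[1])        # residual variance
    se = np.sqrt(s2 * np.diag(np.linalg.inv(X.T @ X)))
    return beta, beta / se, resid

rng = np.random.default_rng(2)
n = 200
x = np.cumsum(rng.normal(size=n))     # made-up series
y = np.cumsum(rng.normal(size=n))     # made-up series

beta, t, resid = ols_t_statistics([x, lag(y)], y)
print("t for X:     ", round(t[1], 2))
print("t for lag(Y):", round(t[2], 2))
```

To try lag(2,Y) and lag(3,Y), add lag(y, 2) and lag(y, 3) to the list of columns.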
Another cure (optional)
If you have data that looks very smooth (like GDP or
population) then one possible cure is to difference the data.
So use the diff(Y) and diff(X) formulas to generate two new
columns.
Check that each row consists of the difference between pairs of
rows of the original column. Wherever the difference is positive, Y
should have increased from the previous row.
Now do a regression of diff(Y) on diff(X). It should have the
"same slope" as the original relationship between X and Y. But
the residuals should look much better.
If the residuals look good, interpret the t-statistic for the
slope.
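Why the "same slope"? If Y = a + b*X + error holds on every row, then subtracting the previous row from each row cancels the intercept a and leaves diff(Y) = b*diff(X) + diff(error), so b is unchanged. Here is a small Python sketch of that idea (NumPy assumed). The data are made up with a hypothetical true slope of 3, and diff and fit_line are illustrative helper names, not JMP functions.

```python
import numpy as np

def diff(v):
    """Row t minus row t-1; the result is one row shorter than v."""
    return np.diff(np.asarray(v, dtype=float))

def fit_line(x, y):
    """Least-squares [intercept, slope] for y on x."""
    X = np.column_stack([np.ones(len(x)), x])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta

rng = np.random.default_rng(3)
n = 200
x = np.cumsum(1.0 + rng.normal(size=n))   # made-up smooth, trending X
y = 2.0 + 3.0 * x + rng.normal(size=n)    # made-up Y, hypothetical slope 3

levels = fit_line(x, y)
changes = fit_line(diff(x), diff(y))
print("slope on original columns:  ", round(levels[1], 3))
print("slope on differenced columns:", round(changes[1], 3))
```

With real data the two slopes will not agree exactly, which is why the assignment puts "same slope" in quotes.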