STAT 541: Finding Independence
- How to make prediction intervals in JMP (and possibly other
software): If you regress Y on X1, X2 and you want to predict at
X1=5, X2=10, then create Z1 = X1-5, Z2 = X2-10, and regress Y on Z1
and Z2 instead. The intercept of this regression is the fitted value
at the point Z1=0, Z2=0, which is the same point as X1=5, X2=10. In
other words, assuming you are doing a quadratic logistic regression,
the regression formula is:
logit(P(Y)) = alpha + beta1 Z1 + beta2 Z2 +
beta3 Z1^2 + ...
But since Z1 and Z2 are both zero at that point, this reduces to
logit(P(Y)) = alpha + beta1*0 + beta2*0 +
beta3*0^2 + ... = alpha
The confidence interval for this intercept, alpha, is therefore the
same as a confidence interval for the prediction at X1=5, X2=10 (here
on the logit scale).
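The centering trick above can be checked numerically. The following is a minimal sketch, not the JMP procedure itself: it assumes ordinary least squares instead of a logistic fit (the algebra of the trick is the same), uses only the four terms written out in the formula, and runs on synthetic data invented for the illustration. The names quadratic_design, pred_at_point and se_intercept are hypothetical.

```python
import numpy as np

# Synthetic data, invented purely for this illustration.
rng = np.random.default_rng(0)
n = 200
x1 = rng.uniform(0, 10, n)
x2 = rng.uniform(0, 20, n)
y = 1.0 + 0.5 * x1 - 0.3 * x2 + 0.05 * x1**2 + rng.normal(0, 1, n)


def quadratic_design(a, b):
    """Design matrix with columns: intercept, a, b, a^2."""
    return np.column_stack([np.ones_like(a), a, b, a**2])


# Fit on the original predictors, then predict at X1=5, X2=10.
coef_x, *_ = np.linalg.lstsq(quadratic_design(x1, x2), y, rcond=None)
pred_at_point = quadratic_design(np.array([5.0]), np.array([10.0])) @ coef_x

# Fit on the centered predictors Z1 = X1-5, Z2 = X2-10.
Z = quadratic_design(x1 - 5.0, x2 - 10.0)
coef_z, *_ = np.linalg.lstsq(Z, y, rcond=None)

# Standard error of the intercept from the usual OLS covariance formula.
resid = y - Z @ coef_z
sigma2_hat = resid @ resid / (n - Z.shape[1])
se_intercept = np.sqrt((sigma2_hat * np.linalg.inv(Z.T @ Z))[0, 0])

print("prediction at (5, 10):", pred_at_point[0])
print("centered intercept:   ", coef_z[0])
print("approx. 95% interval: ", coef_z[0] - 1.96 * se_intercept,
      coef_z[0] + 1.96 * se_intercept)
```

The two printed estimates agree up to rounding error, so whatever interval the software reports for the centered intercept is an interval for the fitted value at the point of interest.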
Experiment: A study has been performed to determine how students
interact with professors. (OK, it is currently being performed here
at Penn, so we don't have any data yet.) Students are observed
raising their hands, and data are recorded on who gets called on.
1520 events are collected. The question of interest is whether male
professors call on male students more or less often than female
students, and likewise for female professors.
For each event, the following is recorded:
- The sex of the professor
- number of males with their hands up
- number of females with their hands up
- who gets called on
- fraction of males in the class
19 classes of data were collected with 80 observations taken in each
class. 10 male professors and 10 female professors. (But one female
professor used cold calling and so was eliminated from the study.)
How should this be analyzed?
- Regression vs. logistic regression
- Subsample times when there are the same number of males and
females with their hands up
- Which variables should we use? (Only the sex of the student called
on and of the professor? Also who has a hand up?)
- What do we want to test for? (intercept? interaction term?)
- What is the sample size?
Naive logistic regression
- run logistic regression of called_on_sex on fraction_male_hands
for only the male professors.
- Look at prediction for 0.5 male hands. Is this "intercept" zero?
- If we reject that it is zero, what have we proven?
- Each professor has their own pet that they call on?
- Males are biased?
- These males are biased?
- People sitting at the front of classes are called on more often
- The person called on is NOT a random draw from the
students who have their hands up
- Now suppose we compute an intercept for each of the 10
professors. That gives us 10 numbers. We can now test if they
are in fact different from zero. What have we proven?
- This is much closer to showing that males are biased.
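The per-professor version of the analysis can be sketched as follows. This is only an illustration under stated assumptions: the study has no data yet, so the events are simulated for ten professors who are unbiased by construction (the student called on is a random draw from the raised hands); the logistic fit is a small hand-written Newton-Raphson routine, not a particular package; and fit_logistic and one_sample_t are hypothetical helper names. The predictor is centered at 0.5 so that each intercept is the prediction at 0.5 male hands.

```python
import numpy as np


def fit_logistic(x, y, iters=25):
    """Fit logit(P(y=1)) = a + b*x by Newton-Raphson; returns array [a, b]."""
    X = np.column_stack([np.ones_like(x), x])
    beta = np.zeros(2)
    for _ in range(iters):
        p = 1.0 / (1.0 + np.exp(-X @ beta))
        w = p * (1.0 - p)
        grad = X.T @ (y - p)            # score vector
        hess = X.T @ (X * w[:, None])   # observed information
        beta = beta + np.linalg.solve(hess, grad)
    return beta


def one_sample_t(values):
    """t statistic for H0: mean = 0, with len(values) - 1 degrees of freedom."""
    values = np.asarray(values, dtype=float)
    return values.mean() / (values.std(ddof=1) / np.sqrt(len(values)))


# Simulated events (assumption: 10 male professors, 80 events each).
rng = np.random.default_rng(1)
n_professors, n_events = 10, 80

intercepts = []
for _ in range(n_professors):
    frac_male_hands = rng.uniform(0.1, 0.9, n_events)
    # Unbiased professor: P(male called on) equals the fraction of male hands.
    called_on_male = (rng.uniform(size=n_events) < frac_male_hands).astype(float)
    a, b = fit_logistic(frac_male_hands - 0.5, called_on_male)
    intercepts.append(a)

# Classical one-sample test on the 10 intercepts.
t_stat = one_sample_t(intercepts)
print("per-professor intercepts:", np.round(intercepts, 2))
print("t statistic on 9 degrees of freedom:", t_stat)
```

With ten professors the statistic is compared with a t distribution on 9 degrees of freedom, whose two-sided 5% critical value is about 2.262. Because each professor contributes a single number, a professor with a pet student moves only one of the ten values.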
Doesn't clustering hurt us statistically a lot?
- Suppose we have n samples from 10 different hospitals.
- Suppose they are all truly IID normal(mu, sigma^2).
- We should use X-bar = sum/(10n), with SE = sigma/sqrt(10n)
- But, suppose we worry that the data might not be
independent within each hospital
- So we compute X-bar1, X-bar2, ..., X-bar10, one for each hospital
- Each under our assumption is normal(mu, sigma^2/n)
- We then use classical statistics on these 10 numbers
- The average of the averages is the same as X-bar before
- Its SE = (sigma/sqrt(n))/sqrt(10) = sigma/sqrt(10n)
- The SAME SE as before!
- So this gratuitous clustering hasn't hurt (very much).
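The arithmetic in the hospital example can be verified directly. The sketch below assumes hypothetical values sigma = 2, n = 50 and mu = 5 (none of these come from the notes) and simulated IID normal data. It checks that the two standard-error formulas agree and that the average of the 10 hospital averages equals the pooled X-bar.

```python
import math
import numpy as np

# Hypothetical values chosen only for the illustration.
sigma, n, k = 2.0, 50, 10   # k = number of hospitals

# SE of the pooled mean versus SE of the average of the hospital means.
se_pooled = sigma / math.sqrt(k * n)
se_clustered = (sigma / math.sqrt(n)) / math.sqrt(k)

# Simulated IID normal data: one row per hospital.
rng = np.random.default_rng(2)
data = rng.normal(loc=5.0, scale=sigma, size=(k, n))

grand_mean = data.mean()              # X-bar = sum/(10n)
hospital_means = data.mean(axis=1)    # X-bar1, ..., X-bar10
mean_of_means = hospital_means.mean()

# SE estimated from the 10 hospital means alone.
se_estimated = hospital_means.std(ddof=1) / math.sqrt(k)

print("SE pooled:", se_pooled, " SE clustered:", se_clustered)
print("grand mean:", grand_mean, " mean of means:", mean_of_means)
print("SE estimated from 10 means:", se_estimated)
```

One plausible reading of the "(very much)" caveat: in practice sigma is unknown, so the clustered analysis estimates the SE from only 10 numbers and uses a t distribution on 9 degrees of freedom (critical value about 2.262 instead of 1.96), which widens intervals somewhat.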
Last modified: Thu Apr 5 08:31:09 2001