- How to get a confidence interval for a prediction in JMP (and possibly
other software): If you regress Y on X1, X2 and you want to predict at
X1=5, X2=10, then create Z1 = X1-5, Z2 = X2-10, and regress Y on Z1
and Z2 instead. The intercept of this regression is the fitted value at
Z1=0, Z2=0, which is the same point as X1=5, X2=10. In other words,
assuming you are fitting a quadratic logistic regression, the regression
formula is:

logit(P(Y)) = alpha + beta_{1}Z1 + beta_{2}Z2 + beta_{3}Z1^{2} + ...

But since Z1 and Z2 are both zero at that point, this reduces to

logit(P(Y)) = alpha + beta_{1}0 + beta_{2}0 + beta_{3}0^{2} + ... = alpha

The confidence interval reported for the intercept alpha is therefore a confidence interval for the predicted logit at X1=5, X2=10.
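
The shift can be checked numerically. The sketch below is only an illustration under stated assumptions: it uses ordinary least squares rather than a logistic fit (to stay short and dependency-free), the data are simulated, and the helper names `solve`, `ols`, and `se_linear_combo` are made up for this example. It fits the same quadratic model once on X1, X2 and once on Z1, Z2, then compares the prediction at X1=5, X2=10 with the intercept of the shifted fit.

```python
import random

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= f * M[c][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (M[r][n] - sum(M[r][k] * x[k] for k in range(r + 1, n))) / M[r][r]
    return x

def ols(rows, y):
    """Return least-squares coefficients, residual variance, and X'X."""
    p = len(rows[0])
    XtX = [[sum(r[i] * r[j] for r in rows) for j in range(p)] for i in range(p)]
    Xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(p)]
    beta = solve(XtX, Xty)
    resid = [yi - sum(b * v for b, v in zip(beta, r)) for r, yi in zip(rows, y)]
    s2 = sum(e * e for e in resid) / (len(y) - p)
    return beta, s2, XtX

def se_linear_combo(c, XtX, s2):
    """Standard error of c'beta: sqrt(s2 * c' (X'X)^{-1} c)."""
    w = solve(XtX, c)
    return (s2 * sum(ci * wi for ci, wi in zip(c, w))) ** 0.5

def design(u, v):
    """Columns: intercept, first variable, second variable, first variable squared."""
    return [[1.0, a, b, a * a] for a, b in zip(u, v)]

# Simulated data (assumption: any data set would do for this check).
random.seed(0)
x1 = [random.uniform(0, 10) for _ in range(200)]
x2 = [random.uniform(0, 20) for _ in range(200)]
y = [1 + 0.5 * a - 0.3 * b + 0.1 * a * a + random.gauss(0, 1) for a, b in zip(x1, x2)]

# Fit on the original variables and predict at X1=5, X2=10.
beta_x, s2_x, XtX_x = ols(design(x1, x2), y)
x0 = [1.0, 5.0, 10.0, 25.0]
pred_x = sum(b * v for b, v in zip(beta_x, x0))
se_x = se_linear_combo(x0, XtX_x, s2_x)

# Fit on the shifted variables; the intercept is the prediction at that point.
z1 = [a - 5 for a in x1]
z2 = [b - 10 for b in x2]
beta_z, s2_z, XtX_z = ols(design(z1, z2), y)
se_z = se_linear_combo([1.0, 0.0, 0.0, 0.0], XtX_z, s2_z)

print("prediction at (5, 10):", pred_x, "SE:", se_x)
print("intercept of Z fit:   ", beta_z[0], "SE:", se_z)
```

The two printed lines should agree up to rounding, which is why the intercept's interval in the shifted fit can be read as the interval for the prediction.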

For each event, the following is recorded:

- sex of the professor
- number of males with their hands up
- number of females with their hands up
- who gets called on
- fraction of males in the class

- Regression vs. logistic regression
- Subsample times when there are the same number of males and females?
- Which variables should we use? (Only sex of called on and professor? Also who has hand up?)
- What do we want to test for? (intercept? interaction term? some slope?)
- What is the sample size?

- Run a logistic regression of called_on_sex on fraction_male_hands for the male professors only.
- Look at the prediction when fraction_male_hands is .5. Is this "intercept" zero?
- If we reject that it is zero, what have we proven?
- Each professor has their own pet that they call on?
- Males are biased?
- These males are biased?
- People sitting at the front of classes are called on more often?
- The person called on is NOT a random draw from the students who have their hands up
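
A minimal sketch of the fit described above, assuming simulated data in which the professor picks at random among raised hands (so there is no bias by construction). The variable `called_on_male` is a 0/1 coding of called_on_sex chosen for this example, and `fit_logistic` is a hypothetical hand-written Newton-Raphson routine, not a JMP feature. Shifting fraction_male_hands by .5 makes the intercept the predicted logit at .5.

```python
import math
import random

def fit_logistic(z, y, iters=25):
    """Newton-Raphson fit of logit(P(y=1)) = a + b*z.

    Returns (a, b, se_a), where se_a comes from the inverse information
    matrix evaluated at the start of the final iteration.
    """
    a = b = 0.0
    for _ in range(iters):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for zi, yi in zip(z, y):
            p = 1.0 / (1.0 + math.exp(-(a + b * zi)))
            w = p * (1.0 - p)
            g0 += yi - p            # score for a
            g1 += (yi - p) * zi     # score for b
            h00 += w
            h01 += w * zi
            h11 += w * zi * zi
        det = h00 * h11 - h01 * h01
        a += (h11 * g0 - h01 * g1) / det
        b += (h00 * g1 - h01 * g0) / det
    se_a = math.sqrt(h11 / det)
    return a, b, se_a

# Simulated events for illustration only: P(male called on) equals the
# fraction of raised hands that are male, i.e. a random draw.
random.seed(0)
fraction_male_hands = [random.uniform(0.1, 0.9) for _ in range(500)]
called_on_male = [1 if random.random() < f else 0 for f in fraction_male_hands]

# Shift so that the intercept is the prediction at fraction_male_hands = .5.
z = [f - 0.5 for f in fraction_male_hands]
a, b, se_a = fit_logistic(z, called_on_male)
z_stat = a / se_a

print("intercept (logit at .5):", a)
print("standard error:", se_a)
print("z statistic for intercept = 0:", z_stat)
```

With unbiased simulated data the z statistic should usually be small; a large value in real data would say only that the person called on is not a random draw from the raised hands.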

- Now suppose we compute an intercept for each of the 10 professors. That gives us 10 numbers. We can now test if they are in fact different from zero. What have we proven?
- This is much closer to showing that males are biased.
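
One way to test the 10 intercepts is a one-sample t-test. The sketch below assumes that approach; the intercept values are invented purely for illustration and are not data from any class, and `one_sample_t` is a hypothetical helper.

```python
import math
import statistics

def one_sample_t(values, mu0=0.0):
    """Return the t statistic and degrees of freedom for H0: mean = mu0."""
    n = len(values)
    mean = statistics.fmean(values)
    se = statistics.stdev(values) / math.sqrt(n)
    return (mean - mu0) / se, n - 1

# Hypothetical per-professor intercepts on the logit scale (made up).
intercepts = [0.4, 0.1, 0.6, -0.2, 0.3, 0.5, 0.0, 0.7, 0.2, 0.4]

t_stat, df = one_sample_t(intercepts)
critical = 2.262  # two-sided 5% critical value of t with 9 degrees of freedom

print("t =", t_stat, "on", df, "df")
print("reject H0 at 5%:", abs(t_stat) > critical)
```

Because each professor contributes a single number, a professor with one favorite student moves only one of the ten values.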

- Suppose we have n samples from each of 10 different hospitals.
- Suppose they are all truly IID normal(mu, sigma^{2}).
- We should use X-bar = sum/(10n), with SE = sigma/sqrt(10n)
- But, suppose we worry that the data might not be independent within each hospital
- So we compute X-bar_{1}, X-bar_{2}, ..., X-bar_{10}.
- Each under our assumption is normal(mu, sigma^{2}/n)
- We then use classical statistics on these 10 numbers
- The average of the averages is the same as X-bar before
- Its SE = (sigma/sqrt(n))/sqrt(10)
- The SAME SE as before!
- So this gratuitous clustering hasn't hurt (very much).
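
The hospital argument can be checked with a small simulation. This is a sketch under assumptions: the values of `mu`, `sigma`, and `n` are arbitrary choices for illustration, and the data are simulated as truly IID normal.

```python
import math
import random
import statistics

random.seed(1)
mu, sigma, n, k = 50.0, 8.0, 40, 10  # hypothetical values; k hospitals, n per hospital
data = [[random.gauss(mu, sigma) for _ in range(n)] for _ in range(k)]

# Pooled analysis: one mean over all 10n observations.
pooled = [x for hospital in data for x in hospital]
xbar_pooled = statistics.fmean(pooled)
se_pooled = sigma / math.sqrt(k * n)

# Clustered analysis: one mean per hospital, then the mean of those 10 numbers.
hospital_means = [statistics.fmean(hospital) for hospital in data]
xbar_clustered = statistics.fmean(hospital_means)
se_clustered = (sigma / math.sqrt(n)) / math.sqrt(k)

# In practice sigma is unknown, so the SE is estimated from the 10 means.
se_clustered_est = statistics.stdev(hospital_means) / math.sqrt(k)

print("pooled mean:   ", xbar_pooled, "SE:", se_pooled)
print("clustered mean:", xbar_clustered, "SE:", se_clustered)
print("SE estimated from the 10 hospital means:", se_clustered_est)
```

The point estimates and theoretical SEs match. The remaining cost is that the clustered SE is estimated from only 10 numbers, so the test uses a t distribution with 9 degrees of freedom.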

Last modified: Thu Apr 5 08:31:09 2001
