# Statistics 541: Random Effects

## Homework:

In this assignment you will simulate some repeated-measurements data and then analyse it to see how much of the information you can recover. The scientific study we are simulating is this: each person has their own sensitivity to the sun. We will apply two different sunscreens to each person's nose and measure the resulting redness. Each person will use one sunscreen for 10 days and then switch to the other sunscreen. We are interested in whether there is a difference between the treatments.

The data you create will then consist of 20 measurements for each of 15 people (300 observations in all). Suppose we measure redness on a 10-point scale (1-10), with 1 being "winter time pale" and 10 being "burned to a bright red." We then observe $Y_{ij}$, the observation taken on person $i$ at the end of day $j$, and $X_{ij}$, which is 0 for $j = 1, \dots, 10$ and 1 for $j = 11, \dots, 20$. The model is

$$Y_{ij} = \beta_0 + \beta_1 X_{ij} + \epsilon_{ij}$$

1. Describe three different scientific stories as to how the $\epsilon_{ij}$ might be correlated. Be as concrete as possible--in other words, suppose you were trying to explain to someone who knows basic statistics (but not advanced statistics) why these shouldn't be analyzed as simply i.i.d. errors.
2. For each of your three scientific stories above, describe how you might model it probabilistically or statistically.
3. Simulate a model of the form:

$$Y_{ij} = \beta_0 + \beta_1 X_{ij} + \epsilon_i + \epsilon_{ij}$$

where all the $\epsilon$'s are independent. Make various plots to see if your simulation makes sense. You will have to round and truncate the numbers to make the $Y$'s land on the integers 1 through 10. You will have to pick a reasonable value for the variance so that the data aren't degenerate (for example, every observation landing on the same value). You will have to think hard about how big a variance you should put on $\epsilon_i$, since it describes individual effects. You will have to think about what reasonable values of $\beta_0$ and $\beta_1$ are.
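A minimal sketch of one way to set up this simulation in Python with NumPy. The function name `simulate_sunscreen` and the default values of `beta0`, `beta1`, `tau` and `sigma` are illustrative assumptions, not values given in the assignment, and normal errors are also an assumption.

```python
import numpy as np

def simulate_sunscreen(n_people=15, n_days=20, beta0=5.0, beta1=-1.0,
                       tau=1.0, sigma=1.0, seed=0):
    """Simulate Y_ij = beta0 + beta1*X_ij + eps_i + eps_ij on a 1-10 scale.

    All default parameter values are hypothetical choices for illustration.
    """
    rng = np.random.default_rng(seed)
    # X_ij is 0 for the first half of the days and 1 for the second half.
    x = np.tile((np.arange(n_days) >= n_days // 2).astype(int), (n_people, 1))
    person_effect = rng.normal(0.0, tau, size=n_people)       # eps_i
    noise = rng.normal(0.0, sigma, size=(n_people, n_days))   # eps_ij
    latent = beta0 + beta1 * x + person_effect[:, None] + noise
    # Round to the nearest integer, then clip onto the 1-10 scale.
    y = np.clip(np.rint(latent), 1, 10).astype(int)
    return x, y, person_effect

x, y, person_effect = simulate_sunscreen()
print("shape:", y.shape)
print("mean redness under treatment 0:", y[x == 0].mean())
print("mean redness under treatment 1:", y[x == 1].mean())
```

Returning `person_effect` alongside the data keeps the true individual effects available for the later comparison plots.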

4. Do some exploratory data analysis to show that your model looks right. Make some histograms, etc.

Now that we have a data set, it is time to analyse it. The primary quantity of interest is $\beta_1$.
1. First run a two-sample t-test. In other words, just compare the average for those under treatment 0 with those under treatment 1. Run this test on your data. Does this test have the correct size? In other words, will 95% confidence intervals cover 95% of the time? (You might be able to check this by simply resimulating 20 times and seeing how often you cover the truth.)
2. Now run a "Tukey test." This means create an estimate of $\beta_1$ for each person. Now compute the one-sample confidence interval based on these 15 numbers. Will a 95% confidence interval created in this fashion cover 95% of the time?
3. Use GEE (Generalized Estimating Equations from the paper I passed out) to estimate the standard error of the simple regression done in the first part. (Note: This one is impossible to do in JMP.)
4. Use a fixed effects model to deal with individual effects. In other words, include the subject as a variable. This will generate 15 estimates of the individual effects. Plot the actual effects vs. the estimated individual effects. Test if this is a 45 degree line as an "estimate" should be. Due to regression to the mean it won't in fact be a 45 degree line. Explain this.
5. Use a random effects model to recover both the individual effects and $\beta_1$.

If your software package does not do this for you automatically, it can be done as follows. First estimate the fixed effects. Now estimate how accurately you have estimated each of the fixed effects. (JMP gives you standard errors.) Now estimate the variance of the estimated fixed effects (by computing a sample variance). These are related by

$$\mathrm{Var}(\text{estimated fixed effects}) = \mathrm{Var}(\epsilon_i) + SE^2$$

So we can now estimate $\mathrm{Var}(\epsilon_i)$, which is also the covariance between the fixed effects estimates and the true random effects. From this we can construct a regression equation for the $\epsilon_i$ based on the estimated fixed effects.

6. Plot your estimated random effects vs the true random effects. Does it seem to be a better fit than the fixed effects plot?
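The by-hand random effects procedure described above can be sketched as follows. This is one reading of those steps under stated assumptions: a balanced design, untruncated normal data, illustrative parameter values, and a squared standard error taken as the pooled residual variance divided by the number of days. All variable names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(1)
n_people, n_days = 15, 20
beta0, beta1, tau, sigma = 5.0, -1.0, 1.0, 1.5   # illustrative values only
x = (np.arange(n_days) >= 10).astype(float)
true_effect = rng.normal(0.0, tau, n_people)
y = (beta0 + beta1 * x + true_effect[:, None]
     + rng.normal(0.0, sigma, (n_people, n_days)))

# Fixed effects fit: in this balanced design, beta1-hat is the difference of
# treatment means and each subject effect is a centered subject mean.
beta1_hat = y[:, x == 1].mean() - y[:, x == 0].mean()
fixed = y.mean(axis=1) - y.mean()

# Squared standard error of one subject mean, from the pooled residual variance.
resid = y - y.mean(axis=1, keepdims=True) - beta1_hat * (x - x.mean())
sigma2_hat = (resid ** 2).sum() / (n_people * n_days - n_people - 1)
se2 = sigma2_hat / n_days

# Var(estimated fixed effects) = Var(eps_i) + SE^2, solved for Var(eps_i).
var_fixed = fixed.var(ddof=1)
tau2_hat = max(var_fixed - se2, 0.0)

# Regression of the true effect on its estimate: slope = covariance / variance.
shrink = tau2_hat / var_fixed
random_effect_hat = shrink * fixed

centered_true = true_effect - true_effect.mean()
print("shrinkage factor:", shrink)
print("MSE, fixed effects: ", np.mean((fixed - centered_true) ** 2))
print("MSE, random effects:", np.mean((random_effect_hat - centered_true) ** 2))
```

The estimated random effects are the fixed effects estimates pulled toward zero by the factor `shrink`, which is the regression-to-the-mean correction the homework asks about.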

• I'm off to see my grandma (she is doing much better now)

## Random intercept

### Basic model

• Simple random effect: $Y_{ij} = X_{ij}\beta + \epsilon_i + \epsilon_{ij}$
• Called random intercept
• $\sigma^2 = \mathrm{Var}(\epsilon_{ij})$, $\tau^2 = \mathrm{Var}(\epsilon_i)$
• Often interested in the distribution of $\epsilon_i$
• In that case, want mean and variance
• Obviously, the mean can't be estimated separately from the intercept, so assume it is zero
• The variance is the interesting term
• Simpler: $Y_{ij} = \alpha + \epsilon_i + \epsilon_{ij}$
• $\mathrm{Var}(Y_{ij}) = \mathrm{Var}(\epsilon_i) + \mathrm{Var}(\epsilon_{ij})$
• We can easily estimate $\mathrm{Var}(Y_{ij})$
• We can easily estimate $\mathrm{Var}(\epsilon_{ij})$
• Hence: estimate $\mathrm{Var}(\epsilon_i)$ by $\widehat{\mathrm{Var}}(Y_{ij}) - \widehat{\mathrm{Var}}(\epsilon_{ij})$
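The subtraction estimate in the last bullet can be tried on simulated data. This is a rough sketch under assumptions not in the notes: a balanced design, normal errors, and illustrative values for `K`, `n`, `alpha`, `tau` and `sigma`. The overall sample variance is only approximately unbiased for $\mathrm{Var}(Y_{ij})$ here because observations within a subject are correlated.

```python
import numpy as np

rng = np.random.default_rng(2)
K, n = 200, 20                      # subjects, measurements per subject (assumed)
alpha, tau, sigma = 5.0, 1.0, 2.0   # illustrative true values
y = alpha + rng.normal(0.0, tau, (K, 1)) + rng.normal(0.0, sigma, (K, n))

# Overall sample variance: approximately estimates Var(Y_ij) = tau^2 + sigma^2.
var_total = y.var(ddof=1)
# Average within-subject variance: estimates Var(eps_ij) = sigma^2.
var_within = y.var(axis=1, ddof=1).mean()
# Difference: estimates Var(eps_i) = tau^2.
tau2_hat = var_total - var_within

print("Var(Y_ij) estimate:  ", var_total)
print("Var(eps_ij) estimate:", var_within)
print("Var(eps_i) estimate: ", tau2_hat)
```

With few subjects the difference can come out negative, which is one reason to look at more careful estimators in practice.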

### Standard errors

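• $\hat\alpha = \sum_{i,j} Y_{ij}/(Kn)$, say ($K$ subjects, $n$ measurements each)
• $\mathrm{Var}(\hat\alpha) = \frac{1}{(Kn)^2}\,\mathrm{Var}\big(\sum_{i,j} Y_{ij}\big)$
• $\mathrm{Var}(\hat\alpha) = \frac{1}{(Kn)^2}\big(\sum_{i,j}\mathrm{Var}(Y_{ij}) + \sum_{(i,j)\neq(i',j')}\mathrm{Cov}(Y_{ij}, Y_{i'j'})\big)$
• $\mathrm{Var}(\hat\alpha) = \frac{1}{(Kn)^2}\big(Kn(\sigma^2+\tau^2) + \sum_i\sum_{j\neq j'}\mathrm{Cov}(Y_{ij}, Y_{ij'})\big)$, since observations on different subjects are uncorrelated
• $\mathrm{Var}(\hat\alpha) = \frac{1}{(Kn)^2}\big(Kn(\sigma^2+\tau^2) + \sum_i n_i(n_i-1)\tau^2\big)$
• Balanced case ($n_i = n$): this simplifies to $\sigma^2/(Kn) + \tau^2/K$
• Notice: if the design is unbalanced, weighted least squares makes sense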
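A quick Monte Carlo check of the variance formula for the balanced case. This is a sketch under assumed normal errors and illustrative values of `K`, `n`, `tau` and `sigma`. It also prints the variance one would get by wrongly treating all $Kn$ observations as i.i.d.

```python
import numpy as np

rng = np.random.default_rng(3)
K, n = 15, 20                       # subjects, measurements per subject (assumed)
alpha, tau, sigma = 5.0, 1.0, 2.0   # illustrative true values
reps = 5000                         # number of simulated data sets

effects = rng.normal(0.0, tau, (reps, K, 1))    # eps_i, shared within a subject
noise = rng.normal(0.0, sigma, (reps, K, n))    # eps_ij
alpha_hat = (alpha + effects + noise).mean(axis=(1, 2))

empirical = alpha_hat.var(ddof=1)
# Formula from the notes with n_i = n for every subject.
formula = (K * n * (sigma**2 + tau**2) + K * n * (n - 1) * tau**2) / (K * n) ** 2
# Variance if the Kn observations were (incorrectly) treated as independent.
naive_iid = (sigma**2 + tau**2) / (K * n)

print("empirical Var(alpha-hat):", empirical)
print("formula:                 ", formula)
print("naive i.i.d. value:      ", naive_iid)
```

The $\tau^2/K$ term does not shrink as more days are added per subject, which is why the i.i.d. calculation understates the uncertainty.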

### Fixed effects version

• $Y_{ij} = X_{ij}\beta + \epsilon_i + \epsilon_{ij}$
• $\sigma^2 = \mathrm{Var}(\epsilon_{ij})$
• The $\epsilon_i$ are unknown parameters
• This is just a one-way ANOVA
• Obviously no estimate of $\tau$, since there isn't a $\tau$ in the model
• Estimate the effects by $\hat\epsilon_i$
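To see the one-way ANOVA connection concretely, here is a small sketch for the simpler intercept-only version of the model, with assumed sizes and parameter values. Least squares on one indicator column per subject (cell-means coding, an assumption about the parameterization) returns the subject means.

```python
import numpy as np

rng = np.random.default_rng(4)
K, n = 5, 8                                   # small illustrative sizes
true_effects = rng.normal(0.0, 1.0, K)
y = 5.0 + true_effects[:, None] + rng.normal(0.0, 1.0, (K, n))

# One indicator column per subject, with no separate intercept column.
subject = np.repeat(np.arange(K), n)
design = (subject[:, None] == np.arange(K)).astype(float)
coef, *_ = np.linalg.lstsq(design, y.ravel(), rcond=None)

group_means = y.mean(axis=1)
print("least squares coefficients:", coef)
print("subject means:             ", group_means)
```

Subtracting the overall mean from these coefficients gives estimated effects that are centered at zero.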

### Estimating the random effects

• Plot estimates of effects vs true effects
• Notice: perfect regression setup
• Compute means, variances, covariances
• But, we want the regression the other way around
• Still can be done
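The two regressions in these bullets can be written out as a short derivation. It is a sketch under assumptions: a balanced design with $n$ measurements per subject, and an estimated effect equal to the true effect plus an independent averaging error $\bar e_i$. The symbol $\bar e_i$ is introduced here for illustration.

$$
\begin{aligned}
\hat\epsilon_i &= \epsilon_i + \bar e_i, \qquad \mathrm{Var}(\bar e_i) = \sigma^2/n \\
\mathrm{Cov}(\hat\epsilon_i, \epsilon_i) &= \tau^2, \qquad \mathrm{Var}(\hat\epsilon_i) = \tau^2 + \sigma^2/n \\
\text{slope of } \hat\epsilon_i \text{ on } \epsilon_i &= \frac{\tau^2}{\tau^2} = 1 \\
\text{slope of } \epsilon_i \text{ on } \hat\epsilon_i &= \frac{\tau^2}{\tau^2 + \sigma^2/n} < 1
\end{aligned}
$$

The second slope is the one needed to predict a true effect from its estimate, and it is below 1 whenever $\sigma^2 > 0$, so the predictions are pulled toward zero.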

## What is a random effect and what is a parameter?

• Are you a Bayesian or a frequentist? Bayesians don't distinguish between the two.
• Are you interested in describing the distribution or the actual values?
• How many effects do you have? (if few, then random effects don't help very much)

## More complex random effects models

• random slopes (often interested in the average slope, say HGH)

Last modified: Thu Apr 12 08:53:02 2001