
# Statistics 541: Logistic Regression 3

• Homework:
• I typed in table 7.7 on pages 330 - 332 from Myers. Load it into your favorite software for analysis. (You might scan it for errors!)
• In spite of the fact that we can't actually look at any good plots, we still need to check assumptions. We don't have to worry about heteroskedasticity since we KNOW the variance. But we do have to worry about curvature. So add variables corresponding to vol^2 and rate^2. Are the quadratic terms significantly better?
• Maybe some other transformation of the X's makes sense? Instead of using a linear term for volume, use a 4th-degree polynomial. Does it fit significantly better than the linear fit? (In other words, you want to have (vol, rate, vol^2, vol^3, vol^4) as X's in your regression. Then test whether vol^2, vol^3, vol^4 are collectively all zero.)
• Suppose we are interested in predicting the point vol=5, rate=3. First describe a measure of how much of an extrapolation this is (Mahalanobis distance would be one possible measure). Now make prediction intervals for this point using your linear model and your quadratic model. Which prediction would you prefer to champion?
• Consider the point (ave(vol), ave(rate)), in other words the center of the data. What is the Mahalanobis measure for this point? Make predictions using both the linear and the quadratic models. Does it matter which interval you use?
• Type up a paragraph saying which model you think is best for fitting this data. You don't have to restrict yourself to the models listed above--you can try other transformations. For example, you might consider the interaction (vol times rate) in the presence of the quadratic terms (namely vol, rate, vol^2, rate^2, vol*rate). This is called a quadratic surface.
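For the Mahalanobis parts of the homework, the computation looks something like the sketch below. Since the data here stand in for table 7.7 (which you should load yourself), the simulated (vol, rate) pairs and the `mahalanobis` helper are purely illustrative:

```python
import numpy as np

rng = np.random.default_rng(0)
# Stand-in data: simulated (vol, rate) pairs in place of Myers' table 7.7,
# just so the sketch is self-contained.
X = np.column_stack([rng.uniform(1, 4, 30), rng.uniform(0.5, 2.5, 30)])

def mahalanobis(point, X):
    """Squared Mahalanobis distance of `point` from the data cloud X."""
    mu = X.mean(axis=0)
    cov = np.cov(X, rowvar=False)
    d = point - mu
    return float(d @ np.linalg.inv(cov) @ d)

d_far = mahalanobis(np.array([5.0, 3.0]), X)   # the extrapolation point
d_center = mahalanobis(X.mean(axis=0), X)      # the center of the data
print(d_far, d_center)                         # center is 0 by construction
```

Large distances signal that a prediction interval relies on the model far outside where the data can check it.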

## General modeling concepts

Suppose one believes Y is a monotone function of X.
• Logistic regression gives one particular monotone form.
• Adding polynomial terms may fix a lack of fit.
• But polynomials impose strange modeling assumptions for large values of X:
• the fitted probability either goes to 1, goes to 0, or goes to "unknown."
Use trimmed X's to fix this problem: regress on both X and an X truncated at, say, the 95% point of the data.
• The regression doesn't know which X to extrapolate with,
• so it will give wide intervals that match the "last good part of the data."
• If we now use trimmed polynomials--we avoid extrapolation AND can fit any function.
• Unfortunately, most software will break due to collinearity problems. Oh well.
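A sketch of building the trimmed regressor (the data are simulated and the 95% cap is the one suggested above; variable names are illustrative):

```python
import numpy as np

x = np.random.default_rng(1).normal(size=200)
cap = np.quantile(x, 0.95)      # the 95% point of the data
x_trim = np.minimum(x, cap)     # X truncated at the cap

# Design matrix with BOTH x and the trimmed x: within the bulk of the
# data they move together; past the cap only x keeps changing.
X = np.column_stack([np.ones_like(x), x, x_trim])

# The two regressors are nearly collinear -- the source of the
# numerical trouble mentioned above.
r = np.corrcoef(x, x_trim)[0, 1]
print(r)
```

The near-perfect correlation printed here is exactly why "most software will break."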

## Computing standard errors via likelihood methods

An advantage of estimators that are linear combinations of the Y's is that we can figure out SE's via a central limit theorem. This was the approach in least squares regression (beta-hat = (X'X)^{-1} X'Y = WY for some weight matrix W).

We have two approaches: we can simply use the weights given by the last round of the IRLS, or we can use a likelihood-based method.
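For logistic regression, the first approach gives cov(beta-hat) ≈ (X'WX)^{-1}, where W = diag(p(1-p)) holds the weights from the final IRLS round. A sketch with stand-in numbers (the design matrix and coefficients below are made up for illustration, not from any real fit):

```python
import numpy as np

# Hypothetical design matrix (intercept + one predictor) and fitted
# coefficients, purely to illustrate the formula.
X = np.column_stack([np.ones(5), np.array([-2., -1., 0., 1., 2.])])
beta_hat = np.array([0.3, 1.1])

p = 1 / (1 + np.exp(-X @ beta_hat))   # fitted probabilities
W = np.diag(p * (1 - p))              # final-round IRLS weights
cov = np.linalg.inv(X.T @ W @ X)      # approximate covariance of beta-hat
se = np.sqrt(np.diag(cov))
print(se)
```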

Likelihood method for standard regression:

• Consider X-bar ~ Normal(mu, sigma^2/n).
• We compute t = X-bar*sqrt(n)/sigma ~ Normal(mu*sqrt(n)/sigma, 1).
• The likelihood under the null (mu = 0) takes t ~ Normal(0, 1).
• The likelihood under mu-hat takes t ~ Normal(x-bar*sqrt(n)/sigma, 1).
• So the likelihood ratio is: exp(-t^2/2)
• So the log likelihood ratio is: lambda = -t^2/2
• So a p-value of .05 implies t ≈ 2, which implies lambda ≈ -2.
• So the intuition is that a log-likelihood difference of about 2 is about a p-value of .05.
• In general, -2*(log likelihood ratio) has approximately a chi-squared distribution.
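The t = 2 intuition can be checked numerically: the two-sided normal p-value at t = 2 matches the chi-squared (1 df) tail of the statistic -2*lambda = t^2. A quick sketch using scipy:

```python
from scipy.stats import norm, chi2

t = 2.0
lam = -t**2 / 2               # log likelihood ratio at t = 2
stat = -2 * lam               # the -2*(log likelihood ratio) statistic

p_two_sided = 2 * norm.sf(t)  # two-sided normal p-value for t = 2
p_chi2 = chi2.sf(stat, df=1)  # chi-squared(1) tail for the LR statistic
print(lam, p_two_sided, p_chi2)
```

The two p-values agree (both about .046), since a squared standard normal is exactly chi-squared with 1 df.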
Likelihood method for logistic regression:
• Compute the likelihood ratio statistic: -2*(log likelihood ratio).
• Reject at .05 if it is bigger than about 4 (the exact chi-squared cutoff with 1 df is 3.84).
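A minimal hand-rolled sketch of this test, assuming a single predictor and simulated data (no dataset is specified here; `fit_logistic` is an illustrative helper, not a library call):

```python
import numpy as np

def fit_logistic(X, y, iters=25):
    """Newton-Raphson (equivalently IRLS) for logistic regression;
    returns the coefficients and the maximized log likelihood."""
    beta = np.zeros(X.shape[1])
    for _ in range(iters):
        p = 1 / (1 + np.exp(-X @ beta))
        W = p * (1 - p)
        beta = beta + np.linalg.solve((X.T * W) @ X, X.T @ (y - p))
    p = 1 / (1 + np.exp(-X @ beta))
    return beta, np.sum(y * np.log(p) + (1 - y) * np.log(1 - p))

# Simulated data with a real effect of x on y.
rng = np.random.default_rng(2)
x = rng.normal(size=100)
y = (rng.uniform(size=100) < 1 / (1 + np.exp(-(0.5 + 1.5 * x)))).astype(float)

_, ll_full = fit_logistic(np.column_stack([np.ones(100), x]), y)
_, ll_null = fit_logistic(np.ones((100, 1)), y)

lr = -2 * (ll_null - ll_full)   # the likelihood ratio statistic
print(lr, lr > 4)               # reject at about .05 if bigger than 4
```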

## Chi-square tests

What if Y is discrete and X is discrete also?
• EG: Y regressed on X, where X takes on values A/a and Y takes on values B/b.
• What is the model?
• P(B|A)/P(b|A) = exp(alpha)
• P(B|a)/P(b|a) = exp(alpha + beta)
• Independence iff P(B|A)/P(b|A) = P(B|a)/P(b|a)
• I.e. independence iff beta = 0
• The typical test is the chi-squared test. It gives almost the same answer--but chi-square is an approximation.
• Permutation tests also available
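In the model above, beta is the log odds ratio of the 2x2 table, so beta = 0 is exactly independence. A sketch with a hypothetical table (the counts are made up), computing both beta and the Pearson chi-squared statistic:

```python
import numpy as np
from scipy.stats import chi2

# Hypothetical 2x2 counts: rows are X = A, a; columns are Y = B, b.
table = np.array([[30., 10.],
                  [20., 40.]])

# beta = log of the odds ratio: odds of B given a over odds of B given A,
# matching the parameterization P(B|a)/P(b|a) = exp(alpha + beta).
beta = np.log((table[1, 0] / table[1, 1]) / (table[0, 0] / table[0, 1]))

# Pearson chi-squared test of independence (an approximation).
expected = np.outer(table.sum(1), table.sum(0)) / table.sum()
stat = ((table - expected) ** 2 / expected).sum()
pval = chi2.sf(stat, df=1)
print(beta, stat, pval)
```

Here beta is far from 0 and the chi-squared p-value is small, so both views of the table reject independence.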