STAT 541: More Logistic Regression
  Statistics 541:  Logistic Regression 3 
Administrivia
-  Homework:
     
       -  I typed in table 7.7 on pages
	    330 - 332 from Myers.   Load it into your favorite
	    software for analysis.  (You might scan it for errors!)
       
-  In spite of the fact that we can't actually look at any
	    good plots, we still need to check assumptions.
	    We don't have to worry about heteroskedasticity since we
	    KNOW the variance (it is determined by the fitted
	    probability).  But we have to worry about curvature.
	    So add variables corresponding to vol^2 and
	    rate^2.  Do the quadratic terms significantly
	    improve the fit?
       
-  Maybe some other transformation of the X's makes sense?
	    Instead of using a linear term for volume use a 4th degree
	    polynomial.  Does it fit significantly better than the
	    linear fit?  (In other words, you want to have (vol, rate,
	    vol^2, vol^3, vol^4) as X's
	    in your regression.  Then test whether the coefficients of
	    vol^2, vol^3, vol^4 are
	    collectively all zero.)
       
-  Suppose we are interested in predicting at the point vol=5,
	    rate=3.  First describe a measure of how much of an
	    extrapolation this is.  (Mahalanobis distance would be one
	    possible measure for this.)  Now make prediction intervals using
	    your linear model and your quadratic model for this point.
	    Which prediction would you prefer to champion?
       
-  Consider the point (ave(vol),ave(rate)), in other
	    words, the center of the data.   What is the Mahalanobis
	    measure for this point?  Make predictions using both the
	    linear and the quadratic model.  Does it matter which
	    interval you use?
       
-  Type up a paragraph saying which model you think is best
	    for fitting this data.  You don't have to restrict
	    yourself to the models listed above--you can try other
	    transformations.  For example, you might consider the
	    interaction (vol times rate) in the presence of the
	    quadratic terms (namely  vol, rate, vol^2,
	    rate^2, vol*rate).  This is called a quadratic
	    surface.
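
As a hedged illustration of the Mahalanobis idea mentioned in the homework, here is a minimal sketch for two covariates.  The (vol, rate) pairs below are made up for the example and are NOT the Myers table 7.7 values; the function name `mahalanobis_2d` is also just an illustrative choice.

```python
import math

def mahalanobis_2d(points, query):
    """Mahalanobis distance from `query` to the mean of 2-D `points`."""
    n = len(points)
    mx = sum(p[0] for p in points) / n
    my = sum(p[1] for p in points) / n
    # Sample covariance matrix entries (divide by n - 1).
    sxx = sum((p[0] - mx) ** 2 for p in points) / (n - 1)
    syy = sum((p[1] - my) ** 2 for p in points) / (n - 1)
    sxy = sum((p[0] - mx) * (p[1] - my) for p in points) / (n - 1)
    det = sxx * syy - sxy ** 2
    dx, dy = query[0] - mx, query[1] - my
    # d^2 = (q - m)' S^{-1} (q - m), written out for the 2x2 case.
    d2 = (syy * dx * dx - 2 * sxy * dx * dy + sxx * dy * dy) / det
    return math.sqrt(d2)

# Made-up (vol, rate) pairs, NOT the Myers table 7.7 values.
data = [(1.0, 1.0), (2.0, 1.5), (3.0, 2.5), (1.5, 2.0), (2.5, 1.0), (2.0, 2.0)]
center = (sum(p[0] for p in data) / len(data),
          sum(p[1] for p in data) / len(data))

d_center = mahalanobis_2d(data, center)    # the center of the data
d_far = mahalanobis_2d(data, (5.0, 3.0))   # the point vol=5, rate=3
print(d_center, d_far)
```

The distance is measured in "standard deviations along the data cloud," so a value of several units signals a serious extrapolation, while the center of the data sits at distance zero.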
     
 
General modeling concepts
Suppose one believes Y is a monotone function of X.
  -  Logistic gives one particular form.
  
-  Adding polynomial terms will possibly fix a poor fit.
  
-  But a polynomial makes strange modeling assumptions for large values of X.
  
-  The fitted probability either goes to 1, goes to zero or goes to "Unknown."
Use trimmed X's to fix this problem.  So regress on both X and an X
truncated at, say, the 95% point of the data.
  -  The regression doesn't know which X to extrapolate with
  
-  So it will give wide intervals that match the "last good part
       of the data."
  
-  If we now do trimmed polynomials--we avoid extrapolation AND
       can fit any function
  
-  Unfortunately, most software will break due to collinearity
       problems.  Oh well.
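
The trimmed-X idea can be sketched roughly as follows.  This is only an illustration: the covariate is made up, the helper names `trimmed` and `correlation` are hypothetical, and the cutoff uses a simple order statistic (other quantile definitions give slightly different cut points).

```python
import math

def trimmed(xs, q=0.95):
    """Return xs capped at (roughly) the q-th empirical quantile, plus the cap."""
    ordered = sorted(xs)
    # Simple order-statistic cutoff; other quantile rules differ slightly.
    cut = ordered[min(len(ordered) - 1, int(q * len(ordered)))]
    return [min(x, cut) for x in xs], cut

def correlation(a, b):
    """Pearson correlation of two equal-length lists."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    sab = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    saa = sum((x - ma) ** 2 for x in a)
    sbb = sum((y - mb) ** 2 for y in b)
    return sab / math.sqrt(saa * sbb)

# Made-up covariate: 1, 2, ..., 40.
x = [float(i) for i in range(1, 41)]
x_trim, cut = trimmed(x)

# Design rows for "regress on both X and a truncated X": (1, x, x_trim).
design = [(1.0, xi, ti) for xi, ti in zip(x, x_trim)]

r = correlation(x, x_trim)
print(cut, r)
```

The two columns agree everywhere except above the cut point, so their correlation is very close to 1.  That near-duplication is the collinearity problem noted above.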
Computing standard errors via likelihood methods
An advantage of estimators that are linear combinations of Y's is that
we can figure out SE's via a central limit theorem.  This was the
approach in least squares regression.  (Beta-hat =
(X'X)^{-1}X'Y = wY for some weight matrix w.)
For logistic regression we have two approaches.  We can simply use the weights given by the
last round of the IRLS, or we can use a likelihood based method.
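
For reference, the two variance formulas behind the weight-based approach can be written out as below.  This is a sketch under the usual assumptions; the symbols sigma^2, W, and p-hat are notation introduced here rather than taken from the notes.

```latex
% Least squares, assuming Var(Y) = \sigma^2 I:
\hat\beta = (X'X)^{-1}X'Y,
\qquad
\operatorname{Var}(\hat\beta) = \sigma^2 (X'X)^{-1}

% Logistic regression, using the weights from the last IRLS round:
\widehat{\operatorname{Var}}(\hat\beta) \approx (X'WX)^{-1},
\qquad
W = \operatorname{diag}\bigl(\hat p_i (1 - \hat p_i)\bigr)
```

The standard errors are the square roots of the diagonal entries in either case.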
Likelihood method for standard regression:
  -  Consider X-bar ~ Normal(mu, sigma^2/n)
  
-  We compute t = X-bar*sqrt(n)/sigma ~ Normal(mu*sqrt(n)/sigma, 1)
  
-  likelihood under the null takes t ~ Normal(0,1)
  
-  likelihood under mu-hat takes t ~ Normal(xbar*sqrt(n)/sigma, 1)
  
-  So the likelihood ratio (null over mu-hat) is: exp(-t^2/2)
  
-  So the log likelihood ratio is: lambda = -t^2/2
  
-  So a p-value of .05 implies t is about 2, which implies lambda = -2
  
-  So the intuition is that a log likelihood difference of 2 is about a p-value
       of .05
  
-  In general, (-2 * log likelihood ratio) has about a chi-squared
       distribution
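
The arithmetic in the list above can be checked numerically.  A minimal sketch, assuming sigma is known so that t is exactly standard normal under the null; the helper names are illustrative only.

```python
import math

def two_sided_p(t):
    """Two-sided p-value for a standard normal test statistic t."""
    return math.erfc(abs(t) / math.sqrt(2))

def log_normal_density(t, mean):
    """Log density of Normal(mean, 1) evaluated at t."""
    return -0.5 * math.log(2 * math.pi) - 0.5 * (t - mean) ** 2

t = 2.0
# Null: mean 0.  Alternative at the MLE: mean equal to the observed t.
log_lr = log_normal_density(t, 0.0) - log_normal_density(t, t)
p_value = two_sided_p(t)
print(log_lr, -2 * log_lr, p_value)
```

With t = 2 the log likelihood ratio is -2 and the two-sided p-value is a little under .05, which is the rule of thumb stated above.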
Likelihood method for logistic regression:
  -  Compute the likelihood ratio statistic: -2 * log likelihood ratio
  
-  Reject at .05 if bigger than 4 (for one parameter; the exact chi-squared cutoff is 3.84)
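
To make the recipe concrete, here is a small sketch of a likelihood ratio test for one slope in a logistic regression, fitted by Newton steps (the IRLS iteration) in plain Python.  The data are made up, `fit_logistic` is a hypothetical helper written for this example, and the standard error shown is the one implied by the weights from the last round.

```python
import math

def fit_logistic(x, y, iters=25):
    """Fit logit P(Y=1) = a + b*x by Newton steps (the IRLS iteration)."""
    a = b = 0.0
    for _ in range(iters):
        p = [1 / (1 + math.exp(-(a + b * xi))) for xi in x]
        w = [pi * (1 - pi) for pi in p]  # IRLS weights for this round
        # Information matrix X'WX for the design columns (1, x).
        h00 = sum(w)
        h01 = sum(wi * xi for wi, xi in zip(w, x))
        h11 = sum(wi * xi * xi for wi, xi in zip(w, x))
        det = h00 * h11 - h01 ** 2
        # Score vector X'(y - p).
        g0 = sum(yi - pi for yi, pi in zip(y, p))
        g1 = sum(xi * (yi - pi) for xi, yi, pi in zip(x, y, p))
        # Newton update: solve the 2x2 system (X'WX) delta = X'(y - p).
        a += (h11 * g0 - h01 * g1) / det
        b += (h00 * g1 - h01 * g0) / det
    # Once converged the final step is negligible, so p, h00 and det from
    # the last round describe the fitted model.
    loglik = sum(math.log(pi if yi else 1 - pi) for yi, pi in zip(y, p))
    se_b = math.sqrt(h00 / det)  # from (X'WX)^{-1}, last round's weights
    return a, b, loglik, se_b

# Made-up data with overlapping 0's and 1's (no perfect separation).
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 10.0]
y = [0, 0, 0, 1, 0, 1, 0, 1, 1, 1]

a, b, loglik1, se_b = fit_logistic(x, y)

# Null model (slope = 0): the MLE of P(Y=1) is the sample proportion.
p0 = sum(y) / len(y)
loglik0 = sum(math.log(p0 if yi else 1 - p0) for yi in y)

lr_stat = -2 * (loglik0 - loglik1)   # -2 log likelihood ratio
reject = lr_stat > 3.84              # chi-squared cutoff, 1 degree of freedom
print(b, se_b, lr_stat, reject)
```

On small samples like this one, the weight-based (Wald) statistic b/se_b and the likelihood ratio statistic need not agree closely, which is one reason to look at both.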
Chi-square tests
What if Y is discrete and X is discrete also?
  -  E.g.: Y regressed on X, where X takes on the values A/a and Y
       takes on the values B/b.
  
-  What is the model?
       
	 -  P(B|A)/P(b|A) = exp(alpha)
	 
-  P(B|a)/P(b|a) = exp(alpha + beta)
	 
-  Independence iff P(B|A)/P(b|A) = P(B|a)/P(b|a)
	 
-  I.e. independence iff beta = 0
       
 
-  The typical test is the chi-squared test.  It gives almost the same
       answer as testing beta = 0 in the model above, but chi-square is an approximation
  
-  Permutation tests also available
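
A small numerical sketch of the 2x2 case follows.  The counts are made up; `alpha` and `beta` follow the model written above, and the likelihood ratio statistic is the usual -2 log likelihood ratio for beta = 0.

```python
import math

# Made-up 2x2 counts: rows are X in {A, a}, columns are Y in {B, b}.
n_AB, n_Ab = 30, 20
n_aB, n_ab = 18, 32

alpha = math.log(n_AB / n_Ab)          # log odds of B given A
beta = math.log(n_aB / n_ab) - alpha   # change in log odds from A to a

observed = [n_AB, n_Ab, n_aB, n_ab]
n = sum(observed)
row = [n_AB + n_Ab, n_AB + n_Ab, n_aB + n_ab, n_aB + n_ab]
col = [n_AB + n_aB, n_Ab + n_ab, n_AB + n_aB, n_Ab + n_ab]
# Expected counts under independence (beta = 0).
expected = [r * c / n for r, c in zip(row, col)]

# Pearson chi-squared statistic and likelihood ratio statistic.
pearson = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
lr_stat = 2 * sum(o * math.log(o / e) for o, e in zip(observed, expected))
print(beta, pearson, lr_stat)
```

For these counts the two statistics come out close to each other (both a bit under 6, on 1 degree of freedom), illustrating the "almost same answer" remark.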
  
Last modified: Tue Mar 27 08:57:19 2001