Statistics 541: Summary
- Always run a regression first--it helps you understand your
- Estimators need standard errors to be useful
- Identification of independence requires scientific knowledge
not statistical knowledge
- Design in orthogonality or suffer collinearity
- Use generalized linear models for efficiency (i.e. when
- This class pushed regression
- We can handle almost any difficulty that arises in regression
- Hence if you have trouble with your data--you can make sure
that you can deal with the primary problems (by using
- Anyone can write down an estimator that will "guess" the right
answer (method of moments, MLE, "it just feels right").
- But without a standard error these are practically useless:
- How to justify significance?
- How to do Bonferroni?
- How to create confidence intervals?
- How to tell if one estimator is more accurate than another?
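A minimal Python sketch of what a standard error buys you for the questions above. The estimates, standard errors, the helper name `summarize`, and the normal approximation are all assumptions made for illustration; none of them come from the course.

```python
from statistics import NormalDist

def summarize(estimates, std_errors, alpha=0.05):
    """Normal-approximation inference for several estimates with their SEs."""
    z = NormalDist()
    m = len(estimates)
    # Bonferroni: test each of the m hypotheses at level alpha / m.
    crit = z.inv_cdf(1 - alpha / (2 * m))
    results = []
    for est, se in zip(estimates, std_errors):
        z_stat = est / se
        p_value = 2 * (1 - z.cdf(abs(z_stat)))
        results.append({
            "estimate": est,
            "p_value": p_value,
            "significant": p_value < alpha / m,
            # Bonferroni-adjusted interval (simultaneous level about 1 - alpha).
            "ci": (est - crit * se, est + crit * se),
        })
    return results

# Hypothetical estimates and standard errors.
results = summarize([2.0, 0.5, -3.0], [0.5, 0.5, 1.0])
for r in results:
    print(r)
```

Without the second argument, none of the three outputs (p-value, Bonferroni decision, interval) can be computed at all.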
- Cheap standard errors: Two independent estimates of the
- for example: one based on the future and the other based on the
past (this would have shortened an Econometrica paper of
mine by 30 pages)
- for example: histograms instead of kernel-smoothed densities
- in expensive simulations: Run it twice
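The "run it twice" idea can be sketched in a few lines of Python. This assumes the two runs are independent, unbiased, and equally variable; `expensive_simulation` is a hypothetical stand-in (a Monte Carlo estimate of pi), not anything from the course.

```python
import math
import random

def combine_two(est_a, est_b):
    """Pool two independent estimates of the same quantity.

    Assumes both are unbiased with equal variance. The sample variance of
    two numbers is (a - b)**2 / 2, so the SE of their mean is |a - b| / 2.
    """
    pooled = (est_a + est_b) / 2
    std_error = abs(est_a - est_b) / 2
    return pooled, std_error

def expensive_simulation(seed, n=10_000):
    """Hypothetical costly simulation: Monte Carlo estimate of pi."""
    rng = random.Random(seed)
    inside = sum(rng.random() ** 2 + rng.random() ** 2 <= 1 for _ in range(n))
    return 4 * inside / n

# Two independent runs, distinguished only by their seeds.
run_1 = expensive_simulation(seed=1)
run_2 = expensive_simulation(seed=2)
estimate, se = combine_two(run_1, run_2)
print(f"estimate = {estimate:.4f}, SE = {se:.4f} (true value {math.pi:.4f})")
```

With only two estimates the standard error has a single degree of freedom, so a 95% interval needs the t multiplier of about 12.7 rather than 1.96. Cheap, but still far better than no standard error.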
- If there isn't independence, the SEs are wrong
- We can identify from the data:
- distributions of errors (normal, Cauchy, etc)
- covariance structure of repeated measurements
- complex patterns (say polynomials)
- It is impossible to identify independence
- The best we can do is look for some simple form of
- Say the simple forms found in time series
- So know the science behind your data to identify independence
- Note: bootstrapping won't help.
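One way to see why bootstrapping won't help: the ordinary bootstrap resamples observations as if they were independent, so it inherits the assumption rather than checking it. The sketch below uses a hypothetical AR(1) series (the model, the coefficient 0.9, and the function names are assumptions for illustration) and compares the bootstrap SE of the mean with the spread of the mean across many fresh series.

```python
import random
import statistics

def ar1_series(n, phi, rng):
    """Simulate x[t] = phi * x[t-1] + noise (hypothetical dependent data)."""
    x, series = 0.0, []
    for _ in range(n):
        x = phi * x + rng.gauss(0, 1)
        series.append(x)
    return series

def iid_bootstrap_se(data, n_boot, rng):
    """SE of the mean from the ordinary bootstrap (resamples as if independent)."""
    n = len(data)
    means = [statistics.fmean(rng.choices(data, k=n)) for _ in range(n_boot)]
    return statistics.stdev(means)

rng = random.Random(541)
n, phi = 200, 0.9

# What the bootstrap reports from a single observed series.
boot_se = iid_bootstrap_se(ar1_series(n, phi, rng), n_boot=500, rng=rng)

# The actual sampling variability of the mean, from many fresh series.
true_se = statistics.stdev(
    [statistics.fmean(ar1_series(n, phi, rng)) for _ in range(500)]
)

print(f"bootstrap SE = {boot_se:.3f}, actual SE = {true_se:.3f}")
```

The bootstrap figure comes out far too small here, and nothing in the resampling output warns you of that. Knowing the data form a time series is what tells you to distrust it.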
Orthogonality and randomization
- If the coefficients of the regression are actually important,
orthogonality/randomization is almost necessary for them to make
- Efficiency requires believing a model
- Some models are easy to believe (say binary data, or Poisson
data). These don't need the disclaimers below.
- Often, the more efficiency you have, the more sensitive
your estimators become to assumptions of your model being
- Tests based on ranks often avoid this problem
- If you have explored your data first using regression, you can
avoid this problem by hand
- If your robust estimator disagrees with your efficient
estimator, use the robust one
- If your robust confidence interval is much wider than your efficient
one, life is good. Give both and let the reader decide if they
are willing to make the added assumption necessary to justify
the narrower window.
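A sketch of "give both" for a simple regression slope. It assumes that "robust" means a heteroskedasticity-robust (HC0 sandwich) standard error and "efficient" means the usual equal-variance formula; the notes do not say which robust method is intended, and the data below are simulated, not real.

```python
import math
import random

def slope_with_two_ses(x, y):
    """OLS slope with a model-based SE and a heteroskedasticity-robust (HC0) SE."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    dx = [xi - x_bar for xi in x]
    sxx = sum(d * d for d in dx)
    slope = sum(d * (yi - y_bar) for d, yi in zip(dx, y)) / sxx
    intercept = y_bar - slope * x_bar
    resid = [yi - intercept - slope * xi for xi, yi in zip(x, y)]
    # Efficient under the model: assumes every error has the same variance.
    se_model = math.sqrt(sum(e * e for e in resid) / (n - 2) / sxx)
    # Robust sandwich form: lets the error variance change with x.
    se_robust = math.sqrt(sum((d * e) ** 2 for d, e in zip(dx, resid))) / sxx
    return slope, se_model, se_robust

# Hypothetical data whose noise grows with x, violating equal variance.
rng = random.Random(541)
x = [rng.uniform(0, 10) for _ in range(2000)]
y = [1 + 2 * xi + rng.gauss(0, 0.1 * xi ** 2) for xi in x]

slope, se_model, se_robust = slope_with_two_ses(x, y)
for label, se in (("model-based", se_model), ("robust", se_robust)):
    print(f"{label:>11}: {slope - 1.96 * se:.3f} to {slope + 1.96 * se:.3f}")
```

Reporting the two intervals side by side makes the price of the equal-variance assumption visible, which is the choice the reader is being asked to make.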
Last modified: Tue Apr 24 07:00:43 2001