Importance of statistics: Keeping people from overstating their
claims. Like a "blessing." So wider intervals are better, and less
significant results are better, since they are more likely to be
correct.
Randomization
The Ecological fallacy
Before the "blocks" project we did "find your own data" projects.
People found census data on states
Y = murders, X = education, Y positively related to X
It's negatively related if you add % urban. But it's positively
related if you add % urban and income.
Bummer--we should outlaw education instead of handguns!
Draw several pictures of how X could be related to Y
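A minimal sketch of one such sign flip, on made-up numbers (not the
census data from the projects). The names urban, educ, and murders
are hypothetical stand-ins. By construction Y falls with X inside
each urban level, yet Y vs X alone slopes upward. Only one flip is
shown; a second covariate such as income could flip it back the
same way.

```python
from statistics import mean

def slope(x, y):
    """Least-squares slope of y regressed on x (with intercept)."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx

def residuals(y, x):
    """Residuals of y after regressing it on x (with intercept)."""
    b = slope(x, y)
    a = mean(y) - b * mean(x)
    return [yi - (a + b * xi) for xi, yi in zip(x, y)]

# Made-up data (an assumption for illustration): education rises
# with % urban, murders rise with % urban, and within an urban
# level murders fall as education rises.
urban, educ, murders = [], [], []
for z in range(5):            # five urban levels
    for d in (-1, 0, 1):      # variation in education within a level
        x = 2 * z + d
        urban.append(z)
        educ.append(x)
        murders.append(5 * z - x)

# Y vs X alone: positive slope.
marginal = slope(educ, murders)

# Y vs X controlling for urban: regress both on urban, then regress
# residuals on residuals. This equals the X coefficient in the
# multiple regression of Y on X and urban.
partial = slope(residuals(educ, urban), residuals(murders, urban))

print("slope of murders on educ alone:       ", round(marginal, 3))
print("slope of murders on educ given urban: ", round(partial, 3))
```

The residual-on-residual step is used here only to avoid a matrix
library; any multiple regression routine should give the same
coefficient on educ.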
Observational studies
(AZ library project yet again) Suppose 15 out of 200 librarians
do the "new reading program."
X = 1 if new, 0 if old, Y = reading level of students
This is better than the state-by-state level analysis since at
least the covariates have to be within each subject
Start controlling for covariates; doing so in a regression sense is
the only way
Once we add "ordering books for new reading program" we will
get an infinite SE on our X variable of interest
(This is as it should be!)
So people leave such variables out
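A rough sketch of why the SE blows up, assuming (hypothetically)
that exactly the 15 librarians in the new program are the ones who
ordered its books. Then the "ordered books" column w is a copy of
X, the matrix X'X has determinant zero, and the usual SE formula,
which needs the inverse of X'X, has nothing to work with. Only the
200 and 15 come from the notes; the rest is illustrative.

```python
n, n_new = 200, 15

# X = 1 for the 15 librarians in the new program, 0 otherwise.
x = [1] * n_new + [0] * (n - n_new)

# Assumption: "ordered books for the new program" is true for
# exactly the same 15 librarians, so this column duplicates x.
w = list(x)

intercept = [1] * n

def gram(cols):
    """Matrix of inner products between columns, i.e. X'X."""
    return [[sum(a * b for a, b in zip(ci, cj)) for cj in cols]
            for ci in cols]

def det(m):
    """Determinant by cofactor expansion (fine for tiny matrices)."""
    if len(m) == 1:
        return m[0][0]
    total = 0
    for j in range(len(m)):
        minor = [row[:j] + row[j + 1:] for row in m[1:]]
        total += (-1) ** j * m[0][j] * det(minor)
    return total

det_full = det(gram([intercept, x, w]))   # with the duplicate column
det_reduced = det(gram([intercept, x]))   # duplicate column left out

print("det(X'X) with ordering-books column:   ", det_full)
print("det(X'X) without ordering-books column:", det_reduced)
```

Leaving w out makes X'X invertible again, which is the move the
notes describe, at the cost of no longer separating the program
from everything that comes bundled with it.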
Intuitive analysis of a multiple regression
People look at a multiple regression and see all the choices
that are made
They say to themselves, couldn't they have done many other
regressions before they found this one?
So people intuitively do Bonferroni and ignore such studies
Hence people can't publish them and can't do them
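A small sketch of the arithmetic behind that intuition. It assumes
the k candidate regressions behave like k independent tests at
level alpha, which is a simplification (real specifications share
data and are correlated). The function names are made up for this
illustration.

```python
def familywise_error(alpha, k):
    """Chance of at least one false positive among k independent
    tests, each run at level alpha."""
    return 1 - (1 - alpha) ** k

def bonferroni_threshold(alpha, k):
    """Per-test level that keeps the overall error rate at or
    below alpha across k tests."""
    return alpha / k

alpha = 0.05
for k in (1, 5, 20, 100):
    print(
        f"k={k:3d}  "
        f"P(at least one false positive)={familywise_error(alpha, k):.3f}  "
        f"Bonferroni per-test level={bonferroni_threshold(alpha, k):.5f}"
    )
```

A reader who suspects 20 regressions were tried is in effect
asking for p < 0.0025 rather than p < 0.05, which few
observational results can meet.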
Controlled experiments
By only doing Y vs X we eliminate the "intuitive Bonferroni
problems."
But, we have all the X --> Z --> Y like diagrams
So randomize assignment, now X is independent of Z FOR ALL Zs!
Ok, not all. Anything that happens post-randomization could
still be an interesting Z.
Hence don't tell the subjects (called single blind).
But merely telling, say, teachers/doctors that group A are
treated and group B are untreated is enough for group A to get
better results
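A simulated sketch of the randomization step above, with made-up
data. z stands for any pre-treatment covariate. Under
self-selection (assumed here to depend on z plus noise) X is
correlated with z; under coin-flip assignment the correlation is
near zero. It says nothing about post-randomization Zs, which is
what blinding is for.

```python
import random
from statistics import mean

def corr(a, b):
    """Pearson correlation between two equal-length sequences."""
    ma, mb = mean(a), mean(b)
    sab = sum((u - ma) * (v - mb) for u, v in zip(a, b))
    saa = sum((u - ma) ** 2 for u in a)
    sbb = sum((v - mb) ** 2 for v in b)
    return sab / (saa * sbb) ** 0.5

rng = random.Random(0)   # fixed seed so the sketch is repeatable
n = 10_000

# A pre-treatment covariate (hypothetical, e.g. prior motivation).
z = [rng.gauss(0, 1) for _ in range(n)]

# Observational: subjects with higher z are more likely to opt in.
x_self = [1 if zi + rng.gauss(0, 1) > 0 else 0 for zi in z]

# Experiment: a coin flip decides treatment, ignoring z entirely.
x_rand = [rng.randint(0, 1) for _ in range(n)]

r_self = corr(z, x_self)
r_rand = corr(z, x_rand)

print("corr(Z, X) under self-selection:", round(r_self, 3))
print("corr(Z, X) under randomization: ", round(r_rand, 3))
```

The same near-zero correlation should hold for any z fixed before
the coin flip, measured or not, which is the point of "FOR ALL Zs."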