Stat 112: Correlation, LS, Extrapolation
- Administrivia:
- Sample homework problems are listed on the schedule along with
the reading. You should do both to keep from falling
behind in the class.
- Measuring the degree of association
- Positive and negative LINEAR associations are measured by
correlation
- Perfectly positive = +1, perfectly negative = -1, no
association = 0 (see page 130 for pictures)
- Nice property: scale independent (try Y=chance of snow,
X=temp)
- Bad property: scale independent! r alone doesn't say how much
Y changes per unit of X (Y = MPG, X = engine size)
- Called r
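
A minimal Python sketch of scale independence. The temperature and snow-chance numbers are made up for illustration (not course data), and the helper name `correlation` is our own choice. Converting temperature from Fahrenheit to Celsius should leave r unchanged.

```python
import math

def correlation(xs, ys):
    """Pearson correlation r of two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Made-up illustrative data: temperature (deg F) and chance of snow (%)
temp_f = [10, 20, 30, 40, 50]
snow_pct = [90, 70, 55, 20, 5]

# Same temperatures expressed in deg C (a linear change of units)
temp_c = [(f - 32) * 5 / 9 for f in temp_f]

r_f = correlation(temp_f, snow_pct)
r_c = correlation(temp_c, snow_pct)
print("r using deg F:", round(r_f, 6))
print("r using deg C:", round(r_c, 6))
```

The two printed values agree up to rounding, and r is negative because snow gets less likely as it warms up.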
- What if it isn't linear
- Tukey's "bulging rule"
- Draw a circle, divide into 4 pieces
- Upper left: sqrt X, log X, 1/X, Y^2
- Upper right: X^2, Y^2
- Lower left: sqrt X, log X, 1/X, sqrt Y, log Y, 1/Y
- Lower right: X^2, sqrt Y, log Y, 1/Y
- So transform data then do analysis
- Example: Boat price vs boat length
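
A small Python sketch of "transform, then analyze". The data are made up (a curve that bulges toward the lower right, not the boat data), and `correlation` is a helper written for this example. Taking log Y, one of the lower-right choices, should straighten the curve and push r toward 1.

```python
import math

def correlation(xs, ys):
    """Pearson correlation r of two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Made-up curved data: Y grows by a constant percentage per unit of X
xs = [1, 2, 3, 4, 5, 6, 7, 8]
ys = [2 * 1.5 ** x for x in xs]

# Bulge points to the lower right, so move Y down the ladder: log Y
log_ys = [math.log(y) for y in ys]

r_raw = correlation(xs, ys)
r_log = correlation(xs, log_ys)
print("r for (X, Y):    ", round(r_raw, 4))
print("r for (X, log Y):", round(r_log, 4))
```

Here the log makes the relationship exactly linear because the fake data were built that way. Real data will only get closer to a line.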
- Which line to use for a blob of points?
- Suppose X and Y aren't related (i.e. correlation = 0)
- Try both flat and 45 degree line--both seem to fit equally
well
- Think about prediction though. If you wanted a large Y,
would you be better off picking a small X or a large X?
- This is what the LS line tries to answer
- Observed = predicted + error. Draw a picture! (or see p 140)
- r = visual LS slope (the slope you see when X and Y are both
drawn in standard units)
- Equation for LS line
- Centercept: the line passes through (X-bar, Y-bar)
- slope: b = r sY/sX
- predicted y = a + b x, where a = Y-bar - b X-bar
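
The formulas above can be sketched in Python as follows. The function name `ls_line` and both small data sets are made up for illustration. The second data set has correlation 0, so the LS line should come out flat.

```python
import math

def ls_line(xs, ys):
    """Return (a, b, r) for the LS line: predicted y = a + b x."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    # Sample standard deviations (divide by n - 1)
    s_x = math.sqrt(sum((x - x_bar) ** 2 for x in xs) / (n - 1))
    s_y = math.sqrt(sum((y - y_bar) ** 2 for y in ys) / (n - 1))
    # Correlation r
    r = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / ((n - 1) * s_x * s_y)
    b = r * s_y / s_x        # slope: b = r sY/sX
    a = y_bar - b * x_bar    # forces the line through (X-bar, Y-bar)
    return a, b, r

# Made-up data with a positive association
a, b, r = ls_line([1, 2, 3, 4, 5], [2, 4, 5, 4, 5])
print("intercept a =", round(a, 4), " slope b =", round(b, 4), " r =", round(r, 4))

# Made-up data with correlation 0: the LS line is flat at Y-bar
a0, b0, r0 = ls_line([1, 2, 3, 4], [1, -1, -1, 1])
print("intercept a =", round(a0, 4), " slope b =", round(b0, 4), " r =", round(r0, 4))
```

The flat second line answers the blob-of-points question: when r = 0, knowing X does not help predict Y, so the best prediction is Y-bar for every X.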
- Extrapolation
- Good life: small error in slope = big error in prediction
- Bad life: linear doesn't even hold
- Blood pressure = 70 + age / 2
- What is blood pressure at age 1000? (prediction = explosion,
truth = 0!)
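
A short Python sketch of both extrapolation problems, using the blood pressure line from the notes. The function name `predict_bp` and the "slightly wrong" slope of 0.55 are assumptions chosen for illustration.

```python
def predict_bp(age, intercept=70.0, slope=0.5):
    """Blood pressure predicted by the line: 70 + age / 2."""
    return intercept + slope * age

# Compare the line from the notes with a line whose slope is a bit off.
# The gap is small inside the range of the data and large far outside it.
for age in (40, 1000):
    line = predict_bp(age)
    off = predict_bp(age, slope=0.55)  # hypothetical slope error of 0.05
    print("age", age, "-> prediction", line, "| with slope error", round(off, 1),
          "| gap", round(off - line, 1))
```

At age 1000 the line gives a prediction no living person could have, which is the "linear doesn't even hold" case. The gap column shows the "small error in slope = big error in prediction" case.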
- Use a computer!
- It is more important to get a computer to do the above for
you than to be able to do it in your head.
- Use JMP IN to do some of the homework.
Last modified: Thu Jan 20 09:14:25 EST 2000