**Exercise 2.2**

- The task here is to interpret the histogram rather than to construct it.

**a.**- The average value falls in the middle of the histogram, where
the histogram balances. This particular histogram is close to
symmetric, with nearly equal left and right tails. It appears to us
that the histogram would balance somewhere in the interval between
842 and 847. Therefore the average value should be about 845.
Variability is a matter of how spread out the histogram is. In this case, there is a value in the 812 to 817 interval and, on the other side, a value in the 867 to 872 interval. (Note that in both intervals the frequency is shown as 1, so there is only one value in each.) We'd say there is virtually no skewness.

**b.**- The smoothed component is shown as a curve through the histogram. Skewness is indicated by a nonsymmetric curve. In this case, the curve appears almost perfectly symmetric around the middle peak. There is little or no skewness.

**Exercise 2.3**

- Recall that a stem-and-leaf display, like the one shown,
groups the data according to the values in the stem. The first value
shown must be 812 (as opposed to 81.2 or 8120), from a look at the
data.
The stem-and-leaf display uses an interval width of 10, in contrast to the width of 5 in the histogram. In effect, the stem-and-leaf display centers its intervals at 815, 825, etc., while the histogram's intervals have different centers. The two pictures therefore aren't identical.

We see basically the same pattern in both displays. The average value is somewhere in the 840's, there is modest variability, and there is very little skewness.

**Exercise 2.11**

**a.**- The mode is defined to be the most common value, and is most often used to describe qualitative data. Here, the data are quantitative. The mode is not very useful for such data.
**b.**- The median is defined as the (*n*+1)/2th value when the data are arranged from lowest to highest. There are *n*=26 data values, so the median is the (26+1)/2=13.5th value, that is, the average of the 13th and 14th values. Arrange the data in order from low to high:

547 625 630 656 664 667 667 667 679 688 688 688 688 688 691 694 697 699 700 701 702 703 703 703 708 711

The 13th and 14th values are both 688, so the median is 688.

**c.**- The mean is the average of all 26 numbers. We would
regard the data as a sample from the ongoing production process,
so we would call the mean the sample mean.
Of course, it doesn't matter
whether we use the original numbers or the sorted numbers.

**d.**- Remember that the mean is pulled in the direction of skewness, as compared to the median. Here, the mean is less than the median, indicating that there is a "tail" of data toward the left (smaller) values, pulling the mean down relative to the median.
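The median-position rule and the mean-versus-median comparison can be checked with a short script. This is only a sketch: the values are copied from the sorted list as printed in part b, and the variable names are our own.

```python
from statistics import mean, median

# The 26 values, copied from the sorted list in part b.
data = [547, 625, 630, 656, 664, 667, 667, 667, 679,
        688, 688, 688, 688, 688, 691, 694, 697, 699,
        700, 701, 702, 703, 703, 703, 708, 711]

n = len(data)
position = (n + 1) / 2            # 13.5: average the 13th and 14th values
ordered = sorted(data)
by_hand = (ordered[12] + ordered[13]) / 2   # 0-based indices 12 and 13

print("n =", n, "median position =", position)
print("median by hand =", by_hand, "library median =", median(data))

# A mean below the median points to a tail toward the smaller values.
print("mean below median:", mean(data) < median(data))
```

The by-hand calculation and the library function agree because `statistics.median` also averages the two middle values when *n* is even.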

**Exercise 2.12**

- Recall that the first digit(s) of each data value are
recorded in the left-hand, stem part of the diagram, and the
next digit in the right-hand, leaf part. It's easiest to use the
sorted data, but either way gives the same picture. The 547 value
is far lower than any other, so we might indicate it separately.

    54 | 7

    62 | 5
    63 | 0
    64 |
    65 | 6
    66 | 4 7 7
    67 | 7 9
    68 | 8 8 8 8 8
    69 | 1 4 7 9
    70 | 0 1 2 3 3 3 8
    71 | 1
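The stem/leaf split described above can be sketched in a few lines. The helper `stem_and_leaf` is a hypothetical name of ours, and the input is only a handful of values taken from the Exercise 2.11 data, not the full data set.

```python
from collections import defaultdict

def stem_and_leaf(values):
    """Map each stem (all digits but the last) to its sorted leaves (the last digit)."""
    display = defaultdict(list)
    for v in sorted(values):
        stem, leaf = divmod(v, 10)   # e.g. 547 -> stem 54, leaf 7
        display[stem].append(leaf)
    return dict(display)

# A few of the values from Exercise 2.11, deliberately unsorted.
subset = [688, 547, 703, 625, 691, 630, 688, 711]

for stem, leaves in stem_and_leaf(subset).items():
    print(f"{stem} | {' '.join(str(leaf) for leaf in leaves)}")
```

Note that this sketch prints only stems that actually occur; a full display would also list empty stems so that gaps in the data remain visible.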

**Exercise 2.17**

**a.**- The mean is shown as 794.23; the standard deviation (STDEV) is shown as 34.25. Therefore, mean minus one standard deviation is 794.23-34.25=759.98 and mean plus one standard deviation is 794.23+34.25=828.48. The actual data are whole numbers, not decimals; values between 760 and 828 will fall in this interval.
**b.**- 51 out of 60 is 85%; 51/60=0.85. According to the Empirical Rule, the percentage theoretically should be only about 68%. The one standard deviation interval is too wide in this case. It seems likely that skewness or outliers have inflated the standard deviation. This will make the interval "too wide" and capture "too many" of the data values.
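The arithmetic in parts a and b can be reproduced as follows. This is a small sketch using only the summary values quoted above; the 0.68 figure is the usual Empirical Rule approximation for the share of values within one standard deviation of the mean.

```python
mean, stdev = 794.23, 34.25   # summary values quoted in part a
low, high = mean - stdev, mean + stdev

observed = 51 / 60            # share of the 60 values inside the interval (part b)
empirical_rule = 0.68         # approximate share expected for mound-shaped data

print(f"one standard deviation interval: {low:.2f} to {high:.2f}")
print(f"observed share {observed:.2f} vs. Empirical Rule {empirical_rule:.2f}")
```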

**Exercise 2.18**

- Recall that outliers are shown in a boxplot as points beyond the "whiskers" of the plot. The boxplot shows several outliers, including one very serious one. These outliers will inflate the standard deviation, making the "one standard deviation interval" wide and causing the Empirical Rule to fail.

**Exercise 2.70**

**b.**- A histogram is shown here. The data
appear basically mound-shaped, with a modest right skew.
**c.**- The mode is at 65, which is a decent first guess for the mean. But there are more values higher than 65 than lower. These values will pull up the mean a little. The mean should be a bit above 65, say about 66.
**d.**- The range just about includes all the values, so it should span roughly the mean plus or minus two standard deviations. Therefore, two standard deviations should be about 5, so one standard deviation should be about 2.5.
**e.**- JMP yielded

    FOOD
    Minimum              61.22
    Maximum              70.74
    Mean                 66.02377
    Median               65.81
    Standard Deviation   2.114983
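The range-based guess in part d can be compared with the reported standard deviation. The sketch below treats the range as roughly four standard deviations, which is a rule of thumb rather than an exact relationship, and uses only the minimum, maximum, and standard deviation reported for FOOD.

```python
minimum, maximum = 61.22, 70.74   # from the JMP output for FOOD
reported_sd = 2.114983            # from the same output

data_range = maximum - minimum
rough_sd = data_range / 4         # range treated as about four standard deviations

print(f"range = {data_range:.2f}")
print(f"rough SD = {rough_sd:.2f}, reported SD = {reported_sd:.2f}")
```

The rough figure lands in the same neighborhood as the reported value, which is all that a quick estimate of this kind is meant to do.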

**Exercise 2.71**

**a.**- A stem-and-leaf display of the data is as follows:

    Decimal point is at the colon

    12 : 4
    12 : 99
    13 : 00111222444
    13 : 55556667778889999
    14 : 0011112244
    14 : 5678999
    15 : 0
    15 : 555
    16 : 1

We would call that more or less bell-shaped, with some right skew.

**e.**- JMP yielded

    NONFOOD
    Minimum              12.38
    Maximum              16.15
    Mean                 13.94585
    Median               13.84
    Standard Deviation   0.7710269

**Exercise 2.72**

**a.**- Here is a boxplot.
**b.**- There is one candidate outlier on the low side, at about 0.805 or so. There are no outliers shown above the upper whisker.
**c.**- JMP calculated summary statistics as shown here.

    RATIO
    Minimum              0.8060
    Maximum              0.8434
    Mean                 0.8256
    Median               0.8254
    Standard Deviation   0.007367554

Is it true that 0.8256=66.02377/(66.02377+13.94585)? By hand, the fraction comes out to 0.8256, all right. Even so, it isn't true in general that the mean of a ratio is the ratio of the means.
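Both halves of this remark can be checked numerically. The first part of the sketch uses the FOOD and NONFOOD means reported above; the second part uses a made-up two-observation example (our own, not from the exercise data) to show that the mean of ratios and the ratio of means can differ.

```python
# Means reported by JMP in Exercises 2.70 and 2.71.
food_mean, nonfood_mean = 66.02377, 13.94585
ratio_of_means = food_mean / (food_mean + nonfood_mean)
print("ratio of means:", round(ratio_of_means, 4))

# Hypothetical data, chosen only to make the difference obvious.
food = [1.0, 9.0]
nonfood = [1.0, 1.0]

ratios = [f / (f + nf) for f, nf in zip(food, nonfood)]
mean_of_ratios = sum(ratios) / len(ratios)

toy_food_mean = sum(food) / len(food)
toy_nonfood_mean = sum(nonfood) / len(nonfood)
toy_ratio_of_means = toy_food_mean / (toy_food_mean + toy_nonfood_mean)

print("mean of ratios:", mean_of_ratios)
print("ratio of means:", toy_ratio_of_means)
```

In the exercise data the two quantities agree to four decimal places, presumably because the individual ratios vary so little; in the made-up data they are clearly different.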