
1. The following example is taken from the Data and Story Library at Carnegie Mellon University. The original source was: Occupational Mortality: The Registrar General's Decennial Supplement for England and Wales, 1970-1972, Her Majesty's Stationery Office, London, 1978.
The data summarize a study of men in 25 occupational groups in England. Two indices are presented for each occupational group. The smoking index is the ratio of the average number of cigarettes smoked per day by men in the particular occupational group to the average number of cigarettes smoked per day by all men. (Values greater than 100 are above average.) The mortality index is the ratio of the rate of deaths from lung cancer among men in the particular occupational group to the rate of deaths from lung cancer among all men.
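As a small illustration of how such an index works, the sketch below assumes the ratio is multiplied by 100 (which is what "100 = average" implies). The function name and the cigarette counts are hypothetical and are not taken from the study.

```python
def relative_index(group_mean: float, overall_mean: float) -> float:
    """Return the group mean as a percentage of the overall mean (100 = average).

    Assumption: the published indices are ratios scaled by 100.
    """
    return 100.0 * group_mean / overall_mean


# Hypothetical values for illustration only, NOT data from the study.
group_cigs_per_day = 12.0    # average for one occupational group
all_men_cigs_per_day = 10.0  # average for all men

smoking_index = relative_index(group_cigs_per_day, all_men_cigs_per_day)
print(f"Smoking index = {smoking_index:.0f}")  # above 100 means above average
```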
Number of cases: 25
Variable Names:
1. Occupational_Group: Occupational Group
2. Mortality: Lung cancer mortality index (100 = average)
3. Smoking: Smoking index (100 = average)
Data:
| Occupational Group | Mortality | Smoking |
| Farmers, foresters, and fisherman | 77 | 84 |
| Miners and quarrymen | 137 | 116 |
| Gas, coke and chemical makers | 117 | 123 |
| Glass and ceramics makers | 94 | 128 |
| Furnace, forge, foundry, and rolling mill workers | 116 | 155 |
| Electrical and electronics workers | 102 | 101 |
| Engineering and allied trades | 111 | 118 |
| Woodworkers | 93 | 113 |
| Leather workers | 88 | 104 |
| Textile workers | 102 | 88 |
| Clothing workers | 91 | 104 |
| Food, drink, and tobacco workers | 104 | 129 |
| Paper and printing workers | 107 | 86 |
| Makers of other products | 112 | 96 |
| Construction workers | 113 | 144 |
| Painters and decorators | 110 | 139 |
| Drivers of stationary engines, cranes, etc. | 125 | 113 |
| Laborers not included elsewhere | 133 | 146 |
| Transport and communications workers | 115 | 128 |
| Warehousemen, storekeepers, packers, and bottlers | 105 | 115 |
| Clerical workers | 87 | 79 |
| Sales workers | 91 | 85 |
| Service, sport, and recreation workers | 100 | 120 |
| Administrators and managers | 76 | 60 |
| Professionals, technical workers, and artists | 66 | 51 |
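For anyone who wants to cross-check SPSS output by hand, here is a minimal sketch in plain Python of the least-squares fit, the predicted values, and the residuals for the table above. It takes the columns exactly as they are labeled in the table (Mortality first, Smoking second); the helper name `fit_line` is made up for this illustration, and the output is meant as a check on SPSS, not a replacement for it.

```python
from math import sqrt

# (occupational group, Mortality, Smoking), columns as labeled in the table.
rows = [
    ("Farmers, foresters, and fisherman", 77, 84),
    ("Miners and quarrymen", 137, 116),
    ("Gas, coke and chemical makers", 117, 123),
    ("Glass and ceramics makers", 94, 128),
    ("Furnace, forge, foundry, and rolling mill workers", 116, 155),
    ("Electrical and electronics workers", 102, 101),
    ("Engineering and allied trades", 111, 118),
    ("Woodworkers", 93, 113),
    ("Leather workers", 88, 104),
    ("Textile workers", 102, 88),
    ("Clothing workers", 91, 104),
    ("Food, drink, and tobacco workers", 104, 129),
    ("Paper and printing workers", 107, 86),
    ("Makers of other products", 112, 96),
    ("Construction workers", 113, 144),
    ("Painters and decorators", 110, 139),
    ("Drivers of stationary engines, cranes, etc.", 125, 113),
    ("Laborers not included elsewhere", 133, 146),
    ("Transport and communications workers", 115, 128),
    ("Warehousemen, storekeepers, packers, and bottlers", 105, 115),
    ("Clerical workers", 87, 79),
    ("Sales workers", 91, 85),
    ("Service, sport, and recreation workers", 100, 120),
    ("Administrators and managers", 76, 60),
    ("Professionals, technical workers, and artists", 66, 51),
]
mortality = [m for _, m, _ in rows]
smoking = [s for _, _, s in rows]


def fit_line(x, y):
    """Least-squares regression of y on x; returns (intercept, slope, r)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    slope = sxy / sxx
    return my - slope * mx, slope, sxy / sqrt(sxx * syy)


# Predict Mortality from Smoking.
intercept, slope, r = fit_line(smoking, mortality)
predicted = [intercept + slope * s for s in smoking]
residuals = [m - p for m, p in zip(mortality, predicted)]

print(f"r = {r:.3f}, r squared = {r * r:.3f}")
print(f"Predicted Mortality = {intercept:.3f} + {slope:.3f} * Smoking")
for (group, m, _), p, e in list(zip(rows, predicted, residuals))[:2]:
    print(f"{group}: observed {m}, predicted {p:.2f}, residual {e:.2f}")

# Regressing the residuals on Smoking should give a slope of (numerically) zero,
# which is what "independent of smoking behavior" means here.
_, resid_slope, resid_r = fit_line(smoking, residuals)
print(f"slope of residuals on Smoking = {resid_slope:.6f}")
```

The last two lines illustrate part f: once the linear effect of Smoking has been removed, nothing linear in Smoking is left in the residuals.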
Use SPSS to draw a scatterplot of the data. (Add a column to the data with the occupational title. Be sure to specify that it is a string variable.) Double-click on the figure, then select Chart/Options and tell it to fit a least-squares line, choosing "display r²" from the Fit Options box.
a. Obtain the correlation and the regression equation for predicting Mortality from Smoking using the regression procedure. Demonstrate to yourself that you could get the same correlation from the correlations procedure.
b. What does r² tell you about this relationship?
c. Interpret the analysis of variance summary table.
d. Write out the regression equation and use a calculator to predict the Mortality score for the first two occupational groups. (The information you need can be found in the output of the regression procedure, but not in that of the correlation procedure.)
e. Comment on the statistical significance (or lack thereof) for the coefficients.
f. Calculate the residuals for the first two occupational groups.
1. Do this by hand, and also let SPSS do it. To get SPSS to calculate predicted values and residuals, click on the Save box in the Regression dialog box, and then check the appropriate boxes. These new scores will be added to the data file.
2. Describe what we mean when we say that the residual variable that you just created is that part of the Mortality score that is independent of smoking behavior.
3. Run the appropriate regression to illustrate the independence that you just dealt with in the previous question.
4. Answer the following questions about the design of the study:
2. Second Example
Katz et al. (1990) examined the performance of students who answered SAT reading comprehension questions with, or without, looking at the passage to which the questions referred. They found that those who did not read the passage did worse than those who did, which should be no big surprise, but that those who did not read it still scored appreciably better than chance. One question was whether the data on their test reflected performance on the SAT. The authors therefore correlated the results for students who did not read the passages with the students' own SAT scores when they applied to college.
The data are:
| Test | 58 | 48 | 48 | 41 | 34 | 43 | 38 | 53 | 41 | 60 |
| SAT | 590 | 590 | 580 | 490 | 550 | 580 | 550 | 700 | 560 | 690 |
| Test | 55 | 44 | 43 | 49 | 47 | 33 | 47 | 40 | 46 | 53 |
| SAT | 800 | 600 | 650 | 580 | 660 | 590 | 600 | 540 | 610 | 580 |
| Test | 40 | 45 | 39 | 47 | 50 | 53 | 46 | 53 |
| SAT | 620 | 600 | 560 | 560 | 570 | 630 | 510 | 620 |
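As a way of checking SPSS results for these data, the sketch below computes the correlation, a regression line, and the t statistic for testing whether the correlation differs from zero. It is written in plain Python; the function name `correlate_and_fit` is made up for this illustration, and the choice to predict Test from SAT (rather than the reverse) is an assumption, since the exercise does not say which variable is the predictor.

```python
from math import sqrt

# The 28 pairs of scores from the table above, in order.
test = [58, 48, 48, 41, 34, 43, 38, 53, 41, 60,
        55, 44, 43, 49, 47, 33, 47, 40, 46, 53,
        40, 45, 39, 47, 50, 53, 46, 53]
sat = [590, 590, 580, 490, 550, 580, 550, 700, 560, 690,
       800, 600, 650, 580, 660, 590, 600, 540, 610, 580,
       620, 600, 560, 560, 570, 630, 510, 620]


def correlate_and_fit(x, y):
    """Return (r, intercept, slope) for the least-squares regression of y on x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    slope = sxy / sxx
    return sxy / sqrt(sxx * syy), my - slope * mx, slope


# Assumption: SAT is the predictor and Test is the criterion.
r, intercept, slope = correlate_and_fit(sat, test)

# t test of H0: rho = 0, on N - 2 degrees of freedom. In simple regression
# this is the same t that tests the slope.
n = len(test)
df = n - 2
t = r * sqrt(df) / sqrt(1 - r * r)

print(f"N = {n}, r = {r:.3f}, r squared = {r * r:.3f}")
print(f"Predicted Test = {intercept:.3f} + {slope:.4f} * SAT")
print(f"t({df}) = {t:.3f}")
```

Swapping the two arguments in the call gives the line for predicting SAT from Test; the correlation and its t are unchanged.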
a. Why would we care about the correlation between performance and the SAT?
b. Using SPSS, plot the variables individually and then draw a scatter plot.
c. Calculate the correlation and the regression line.
1. Interpret the correlation coefficient and tell what this suggests about the experiment. Is it significant, and, if so, what does that tell us?
2. Interpret the slope and intercept appropriately.
3. Are the slope and intercept significant, and how do you know?
d. When Katz et al. (1990) looked at students who had read the passages before answering the questions, the correlation was .68, with a sample size of N = 17.
1. How would you compare those two correlations?
2. Compare the two correlations and interpret your results.
The correlation was .691 for the first part and .532 for the second.
The corresponding Fisher transforms are .850 and .592.
e. What, if anything, does it mean to have a difference between the two groups on the mean comprehension score as far as correlation and regression are concerned?
f. Write up the results of this experiment as if these were your data and you were submitting the results for publication. (That means that you can't just write a sentence or two. You have to say what you did, what you found, and what it means.) Include in this discussion something to indicate that you understand the interpretation of r².
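The comparison of two independent correlations asked for in part d is usually done with Fisher's transformation of r. The sketch below is one way to do that calculation in plain Python; the function names are made up for this illustration. It uses r = .68 with N = 17 from part d, and r = .532 with N = 28 (the number of score pairs in the table) for the students who did not read the passage. The two-tailed p value comes from the standard normal distribution.

```python
from math import atanh, erfc, sqrt


def fisher_z(r: float) -> float:
    """Fisher's transformation r' = 0.5 * ln((1 + r) / (1 - r))."""
    return atanh(r)


def compare_independent_correlations(r1, n1, r2, n2):
    """Return (z, two-tailed p) for H0: rho1 = rho2, independent samples."""
    standard_error = sqrt(1.0 / (n1 - 3) + 1.0 / (n2 - 3))
    z = (fisher_z(r1) - fisher_z(r2)) / standard_error
    p_two_tailed = erfc(abs(z) / sqrt(2.0))  # area in both tails of N(0, 1)
    return z, p_two_tailed


# Values taken from the exercise text: .68 (N = 17) for students who read the
# passage, .532 (N = 28) for students who did not.
z, p = compare_independent_correlations(0.68, 17, 0.532, 28)

print(f"Fisher transform of .532 = {fisher_z(0.532):.3f}")
print(f"z = {z:.3f}, two-tailed p = {p:.3f}")
```

The same `fisher_z` helper can be used to check the transformed values quoted above for .691 and .532.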
Last revised: 11/03/01