Influential observations

This applet allows you to remove individual observations and note the effect on the slope and intercept of the regression line. This is illustrated in the following display, using the data on Alcohol and Tobacco consumption in Britain that we saw in Table 9.3.

The purpose of the data in Table 9.3 was to illustrate the dramatic influence that a single data point can have on both the correlation and the coefficients of the regression line. Begin by clicking on any point to remove it from the data. Note that both the new and the old regression lines will appear. You can restore that point either by clicking on it again or by clicking on another point. If you choose the second alternative, the first point will be restored and the one you just clicked on will be removed.

The obvious point for removal is the one from Northern Ireland in the lower right corner. But see what happens when you remove other points. Begin by removing points close to the line, and then more extreme points.
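If you do not have the applet at hand, the same experiment can be sketched in a few lines of Python. The values below are made up for illustration and are not the Table 9.3 data; the function names fit_line and fit_without are my own. The sketch fits the line to all of the points, then refits it with each point removed in turn.

```python
def fit_line(xs, ys):
    """Return (slope, intercept) of the least-squares regression line."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def fit_without(xs, ys, i):
    """Refit the line after removing the point at index i."""
    return fit_line(xs[:i] + xs[i + 1:], ys[:i] + ys[i + 1:])


# Made-up illustrative values (NOT the Table 9.3 data): ten points close to
# a rising line, plus one point far to the right and well below the trend.
xs = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 10.0, 14.0]
ys = [3.1, 3.9, 5.2, 5.8, 7.1, 8.0, 8.9, 10.2, 10.9, 12.1, 4.0]

full_slope, full_intercept = fit_line(xs, ys)
print(f"all points:      slope = {full_slope:.3f}, intercept = {full_intercept:.3f}")

for i in range(len(xs)):
    slope, intercept = fit_without(xs, ys, i)
    print(f"without point {i:2d}: slope = {slope:.3f}, intercept = {intercept:.3f}")
```

Removing most of the points barely moves the line, whereas removing the last one (the point in the lower right) changes the slope substantially.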

What happens when you remove, one at a time, two points that are approximately equal distances from the regression line, but one of which is farther from the mean of X than the other?
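One way to explore this question numerically is through leverage. In simple regression the leverage of point i is 1/n plus its squared distance from the mean of X divided by the sum of squared deviations of X, so it grows as a point moves away from the mean. The sketch below uses made-up data (an exactly linear trend with two points pushed upward by the same amount), and the names leverages and slope_change are hypothetical helpers of my own.

```python
def fit_line(xs, ys):
    """Return (slope, intercept) of the least-squares regression line."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def leverages(xs):
    """Leverage of each point: 1/n + (x - mean_x)^2 / Sxx."""
    n = len(xs)
    mean_x = sum(xs) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return [1 / n + (x - mean_x) ** 2 / sxx for x in xs]


def slope_change(xs, ys, i):
    """Slope with all points minus slope with point i removed."""
    full_slope, _ = fit_line(xs, ys)
    reduced_slope, _ = fit_line(xs[:i] + xs[i + 1:], ys[:i] + ys[i + 1:])
    return full_slope - reduced_slope


# Made-up data: an exact line y = 2 + x at x = 1, ..., 11 (mean of X is 6).
xs = [float(x) for x in range(1, 12)]
ys = [2.0 + x for x in xs]

# Push two points upward by the same amount: one near the mean of X (x = 7)
# and one far from it (x = 11).
near, far = 6, 10
ys[near] += 3.0
ys[far] += 3.0

h = leverages(xs)
for label, i in (("near the mean", near), ("far from the mean", far)):
    print(f"{label}: x = {xs[i]:.0f}, leverage = {h[i]:.3f}, "
          f"slope change on removal = {slope_change(xs, ys, i):.4f}")
```

The point far from the mean of X has the larger leverage, and removing it shifts the slope by more than removing the point near the mean does.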

The following display contains the data on cancer and solar radiation, shown in Figure 9.3. (I have divided radiation levels by 100 for convenience.) This display gives you more points, and less extreme points, to play with than the previous display. Notice how the increase in sample size reduces the degree to which any one point influences the regression line.
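The effect of sample size can also be sketched without the applet. The example below does not use the cancer and solar radiation data; it places n made-up points exactly on the line y = x between 0 and 10, adds a single discrepant point at (10, 0), and reports how far that one point pulls the slope. The function name slope_shift is my own.

```python
def fit_line(xs, ys):
    """Return (slope, intercept) of the least-squares regression line."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def slope_shift(n):
    """How much one discrepant point lowers the slope among n well-behaved points."""
    # n made-up points evenly spaced on the line y = x between 0 and 10.
    xs = [10 * k / (n - 1) for k in range(n)]
    ys = list(xs)
    slope_without, _ = fit_line(xs, ys)
    # The same data plus a single discrepant point at (10, 0).
    slope_with, _ = fit_line(xs + [10.0], ys + [0.0])
    return slope_without - slope_with


for n in (5, 10, 20, 50, 100):
    print(f"n = {n:3d}: one odd point lowers the slope by {slope_shift(n):.3f}")
```

The shift shrinks as n grows, which is the pattern the larger display is meant to show.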

Comments to: Gary.McClelland@Colorado.edu

