The following is copied directly from Utts, J. M. (1996) Seeing Through Statistics, Belmont, CA: Duxbury Press. (This is an excellent book dealing with the interpretation of data, especially through the use of published reports.) The selection deals with the issue of salary differences between males and females, and what it means for a difference to be significant or not significant. Or, more correctly, can a difference be important even when it is not significant?
It seems more than a little strange to say "I don't have enough data to be able to conclude that anything is going on, but what is going on is certainly important." On the other hand, we know from discussions of statistical power that not every experiment is going to find a statistically significant result, even if there is a large difference between population means. When it comes to multiple comparisons, we know that not all true differences are going to produce a significant result.
The issue involved is one that has probably come up on every university campus, and in many businesses as well. The debate that follows plays itself out over and over again.
I think that the telling line in the following selection is the one that says "However, for some of the subject matter groups, the difference found was not statistically significant." This tells me that the difference was significant for at least some groups, and from the way it was written I assume that this is true for more than one group. That, in turn, tells me that we are talking about an issue of power. If the null is false in each case, and power = .50, we would expect that half of the comparisons would not find a significant difference. Even at a very large university, some subject matter groups are going to have relatively few faculty, and for those cases power is almost certain to be well below .50.
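The arithmetic behind "half of the comparisons" can be sketched in a few lines of Python. This is only an illustration: it takes the 11 subject matter groups mentioned in the selection, and it assumes that every group has the same power (.50) and that the tests are independent of one another, neither of which is likely to hold exactly. The function name is made up for the example.

```python
from math import comb


def prob_k_nonsignificant(n_groups, power, k):
    """Binomial probability that exactly k of n_groups tests miss significance.

    Assumes the null is false in every group, each test has the same power,
    and the tests are independent (simplifying assumptions for illustration).
    """
    miss = 1.0 - power
    return comb(n_groups, k) * miss**k * power**(n_groups - k)


n_groups = 11   # number of subject matter groups in the selection
power = 0.50    # assumed common power, as in the text above

# Expected number of groups that fail to reach significance
expected_misses = n_groups * (1.0 - power)

# Chance that all 11 comparisons come out significant
p_all_significant = prob_k_nonsignificant(n_groups, power, 0)

# Chance that at least one comparison is nonsignificant
p_at_least_one_miss = 1.0 - p_all_significant

print(f"Expected nonsignificant groups: {expected_misses}")
print(f"P(all significant):             {p_all_significant:.5f}")
print(f"P(at least one nonsignificant): {p_at_least_one_miss:.5f}")
```

Under these assumptions, a set of results in which some groups are nonsignificant is what we should expect to see even when a real difference exists in every group.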
"A number of universities have tried to determine whether or not male and female faculty members with equivalent seniority earn equivalent salaries. A common method for doing so is to use the salary and seniority data for men to find a regression equation to predict expected salary when seniority is known. The equation is then used by predicting what each woman's salary should be, given her seniority, and comparing it with her actual salary. The differences between actual and predicted salaries are then averaged over all of the women faculty members, to see if on average they are higher or lower than they would be if the equation based on the men's salaries worked for them.
"Tomlinson-Keasey, Utts and Strand (1994) used this method to study salary differences between male and female faculty members at the University of California at Davis. They divided faculty into 11 separate groups, by subject matter, in order to make more useful comparisons.
"In each of the 11 groups, the researchers found that the women's actual pay was lower than what would be predicted from the regression equation and they concluded that the situation should be investigated further. However, for some of the subject matter groups, the difference found was not statistically significant. The researchers' conclusion that there was a problem that needed to be studied further generated some criticism on that basis.
"Let's look at how large a difference would have had to exist in order for the study to be statistically significant. We will use the data from the humanities group as an example. There were 92 men and 51 women included in that analysis. The mean difference between men's and women's salaries, after accounting for seniority and years since Ph.D., was $3,612.
"If we were to assume that the data came from some larger population of faculty members and test the null hypothesis that men and women were paid equally, then the p-value for the test would be 0.08. Thus, a statistically naive reader could conclude that there is no problem, since the study found no statistically significant difference between average salaries for men and for women, adjusted for seniority.
"Because of the natural variability in salaries, even after adjusting for seniority, the sample means would have to differ by over $4,000 per year for samples of this size in order to be able to declare the difference to be statistically significant.
"The conclusion that there is not a statistically significant difference between men's and women's salaries does not imply that there is not an important difference. It simply means that the natural variability in salaries is so large that a very large difference in means would be required to achieve statistical significance.
"As one student suggested, the male faculty who are complaining about the study's conclusions because the differences are not statistically significant should donate the "nonsignificant" amount of $3,612 to help a student pay fees next year. (Utts (1996), pp 409-410)
Last revised: 01/14/2002