Demonstration of Correlations when the
Population Correlation is 0.00

David C. Howell


For this example I used SPSS to generate five variables as random samples (of 20 cases each) from a normally distributed population. These samples are independent of each other, so the population correlation between any two of them is 0.00. The results, without the data, are shown below. Notice that for each pair of variables the intercorrelation matrix shows the correlation, below that the sample size, and below that the two-tailed significance level. (Thus, for example, when the true correlation between X1 and X2 in the population is 0.00, a sample correlation at least as extreme as the observed -.1127, in either direction, would occur 63.6 percent of the time.)


                      - -  Correlation Coefficients  - -

             X1         X2         X3         X4         X5

X1           1.0000     -.1127      .2541     -.3364      .1563
            (   20)    (   20)    (   20)    (   20)    (   20)
            P= .       P= .636    P= .280    P= .147    P= .511

X2           -.1127     1.0000     -.1044      .1905      .0451
            (   20)    (   20)    (   20)    (   20)    (   20)
            P= .636    P= .       P= .661    P= .421    P= .850

X3            .2541     -.1044     1.0000     -.1739      .3960
            (   20)    (   20)    (   20)    (   20)    (   20)
            P= .280    P= .661    P= .       P= .464    P= .084

X4           -.3364      .1905     -.1739     1.0000     -.1503
            (   20)    (   20)    (   20)    (   20)    (   20)
            P= .147    P= .421    P= .464    P= .       P= .527

X5            .1563      .0451      .3960     -.1503     1.0000
            (   20)    (   20)    (   20)    (   20)    (   20)
            P= .511    P= .850    P= .084    P= .527    P= .

(Coefficient / (Cases) / 2-tailed Significance)
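
The two-tailed significance levels in the matrix can be checked by hand. A common approach, assumed here rather than taken from the SPSS documentation, converts r to t = r * sqrt(n - 2) / sqrt(1 - r^2) on n - 2 degrees of freedom and finds the area in both tails of the t distribution. The Python sketch below does this with the standard library only; the function names and the use of Simpson's rule for the area are illustrative choices. For r = -.1127 and r = .3960 with 20 cases, it should give values close to the .636 and .084 printed above.

```python
import math


def t_density(t, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_coef = math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
    coef = math.exp(log_coef) / math.sqrt(df * math.pi)
    return coef * (1 + t * t / df) ** (-(df + 1) / 2)


def two_tailed_p(r, n, steps=2000):
    """Two-tailed p value for a sample correlation r based on n cases,
    testing the null hypothesis that the population correlation is 0."""
    if abs(r) >= 1:
        return 0.0  # a perfect correlation leaves no area in the tails
    df = n - 2
    t = abs(r) * math.sqrt(df / (1 - r * r))
    # Simpson's rule (steps must be even) for the area between 0 and |t|.
    h = t / steps
    area = t_density(0.0, df) + t_density(t, df)
    for i in range(1, steps):
        area += (4 if i % 2 else 2) * t_density(i * h, df)
    area *= h / 3
    # Whatever is not between -|t| and +|t| lies in the two tails.
    return 1 - 2 * area


print(round(two_tailed_p(-0.1127, 20), 3))  # X1 with X2
print(round(two_tailed_p(0.3960, 20), 3))   # X3 with X5
```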

A scatterplot of these data follows:

[Figure: scatterplot, x1x5corr.gif]
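
Readers without SPSS can generate data of the same kind themselves. The sketch below is a stand-in for the SPSS run, not a reproduction of it: it draws five independent standard normal variables and prints their intercorrelation matrix. The function names, the seed, and the choice of a standard normal population are assumptions made for illustration, so the numbers will not match the matrix above.

```python
import math
import random


def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def simulate_matrix(n_cases, n_vars=5, seed=1):
    """Draw n_vars independent normal samples of n_cases each and
    return their correlation matrix as a list of rows."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    data = [[rng.gauss(0, 1) for _ in range(n_cases)] for _ in range(n_vars)]
    return [[pearson_r(a, b) for b in data] for a in data]


for row in simulate_matrix(20):
    print("  ".join(f"{r:7.4f}" for r in row))
```

Changing the seed gives a fresh set of samples, and with only 20 cases the off-diagonal correlations can wander quite far from 0.00.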

What if we increase the sample size?

To give you a sense of the relationship between sample size and the variability of correlation coefficients, I have repeated the previous example, but this time I have generated 200 cases for each variable. Because the correlations are based on much more data, they should hover more closely around the true population correlation of 0.00. Can you see this in the following set of data?

                      - -  Correlation Coefficients  - -

             X1         X2         X3         X4         X5

X1           1.0000     -.0002      .0500      .0236      .0072
            (  200)    (  200)    (  200)    (  200)    (  200)
            P= .       P= .998    P= .482    P= .741    P= .919

X2           -.0002     1.0000     -.0378      .1233      .0306
            (  200)    (  200)    (  200)    (  200)    (  200)
            P= .998    P= .       P= .595    P= .082    P= .667

X3            .0500     -.0378     1.0000      .1810     -.0225
            (  200)    (  200)    (  200)    (  200)    (  200)
            P= .482    P= .595    P= .       P= .010    P= .751

X4            .0236      .1233      .1810     1.0000     -.0168
            (  200)    (  200)    (  200)    (  200)    (  200)
            P= .741    P= .082    P= .010    P= .       P= .814

X5            .0072      .0306     -.0225     -.0168     1.0000
            (  200)    (  200)    (  200)    (  200)    (  200)
            P= .919    P= .667    P= .751    P= .814    P= .

(Coefficient / (Cases) / 2-tailed Significance)

" . " is printed if a coefficient cannot be computed
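
The effect of sample size on the spread of sample correlations can also be checked by simulation. The sketch below is an illustration rather than part of the SPSS analysis: it repeatedly draws two independent normal samples, computes r each time, and reports the standard deviation of those r values for 20 and for 200 cases. The function names, seed, and number of repetitions are arbitrary choices. When the population correlation is 0.00 and the data are normal, that standard deviation is 1 / sqrt(n - 1), roughly .23 for 20 cases and .07 for 200.

```python
import math
import random
import statistics


def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def sd_of_r(n_cases, reps=2000, seed=1):
    """Standard deviation of r over reps pairs of independent normal samples."""
    rng = random.Random(seed)
    rs = []
    for _ in range(reps):
        x = [rng.gauss(0, 1) for _ in range(n_cases)]
        y = [rng.gauss(0, 1) for _ in range(n_cases)]
        rs.append(pearson_r(x, y))
    return statistics.stdev(rs)


for n in (20, 200):
    print(n, round(sd_of_r(n), 3), "theory:", round(1 / math.sqrt(n - 1), 3))
```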

Notice that there is one Type I error here. (Remember that a Type I error consists of rejecting the null hypothesis when it is in fact true. Since I drew all of my samples independently, the true correlation in the population would in fact be 0.00.) Can you find the Type I error? What do you think happens to the probability of a Type I error when we work at α = .05, but run many hypothesis tests? (How many tests did we actually run here?)
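
One way to put rough numbers on these questions is sketched below in Python. Five variables give 5 * 4 / 2 = 10 distinct pairs, and each pair appears twice in the matrix. If the 10 tests were independent, the chance of at least one Type I error would be 1 - (1 - α)^10. Independence is an assumption made here for simplicity: correlations that share a variable are not strictly independent, so the result is an approximation. The function name is illustrative.

```python
from math import comb


def familywise_error_rate(alpha, n_tests):
    """Chance of at least one Type I error across n_tests independent
    tests, each run at significance level alpha."""
    return 1 - (1 - alpha) ** n_tests


n_tests = comb(5, 2)  # distinct pairs among five variables
print(n_tests, round(familywise_error_rate(0.05, n_tests), 3))  # prints: 10 0.401
```

So even with every null hypothesis true, a set of 10 tests at α = .05 has roughly a 40 percent chance of producing at least one significant result.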


Last revised: 7/13/98