Answers to Chi-square Lab #2

10/4/01

This page is intended for partial answers to some of the questions raised in this lab. It is not my intent to answer everything. I amended it on 6/3/2003 because someone cited it as semi-authoritative, and when I read it over I found things that I wish I had said better.

The data table was:

 

             Survived   Died   Total
Ritonavir       472       71     543
Placebo         399      148     547
Total           871      219    1090

Results

This table should be pretty clear. The chi-square is the one that we would calculate by hand. The continuity correction is something that I suggest in the book that you ignore, so I won't expand on that. The likelihood ratio is another way of computing chi-square. It has a chi-square distribution asymptotically, which means that it would be distributed exactly as chi-square if we had an infinite number of cases. (Actually, both Pearson's chi-square and the likelihood ratio chi-square are asymptotically distributed as chi-square. Neither is exactly chi-square for N < infinity, although they both approach it rapidly. Pearson is usually the better statistic for small sample sizes.) Fisher's exact test is just that--it is exact, but only if we assume that the row and column totals are fixed, or, alternatively, if we wish to take Fisher's "resampling" approach to hypothesis testing. (But there are strong arguments that fixing the marginals, which is called conditioning on the marginals, is not a bad idea.)
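As a rough check on the first two statistics, here is a minimal Python sketch (not SPSS output) that computes Pearson's chi-square and the likelihood ratio chi-square by hand from the data table above. The variable names are mine, and the p-value helper assumes 1 df, which is what a 2 x 2 table has. It ignores the continuity correction and Fisher's exact test.

```python
import math

# Observed counts from the data table: rows are Ritonavir and Placebo,
# columns are Survived and Died.
observed = [[472, 71],
            [399, 148]]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
n = sum(row_totals)

# Expected count under independence: (row total * column total) / N.
expected = [[r * c / n for c in col_totals] for r in row_totals]

# Pair every observed count with its expected count.
cells = [(o, e)
         for o_row, e_row in zip(observed, expected)
         for o, e in zip(o_row, e_row)]

# Pearson chi-square: sum of (O - E)^2 / E over the four cells.
pearson = sum((o - e) ** 2 / e for o, e in cells)

# Likelihood ratio chi-square: 2 * sum of O * ln(O / E).
likelihood_ratio = 2 * sum(o * math.log(o / e) for o, e in cells)


def p_value_1df(chi_sq):
    """Upper-tail probability for a chi-square statistic on 1 df."""
    return math.erfc(math.sqrt(chi_sq / 2))


print(f"Pearson chi-square = {pearson:.3f}, p = {p_value_1df(pearson):.2e}")
print(f"Likelihood ratio   = {likelihood_ratio:.3f}, "
      f"p = {p_value_1df(likelihood_ratio):.2e}")
```

With a sample this large the two statistics should come out close to each other, which is the practical meaning of "both approach chi-square rapidly."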

  

This table has some problems as far as I am concerned. It isn't that the numbers are wrong, but that the labels don't seem to tell me what the numbers tell me.

First of all, the odds ratio is correct, but it refers to the ratio of the odds of dying if you are in the Placebo group relative to the odds of dying if you are in the Ritonavir group. The odds of dying are 2.466 times greater if you are in the Placebo group. Taking the inverse, the odds of dying if you get Ritonavir are .406 times the odds of dying if you get the placebo. Those two numbers are just the inverse (i.e. 1/x) of each other. I know that is what those numbers mean, but I can't get my head to read the row label in a way that says that.

Now go down a row. I thought, erroneously, that I would be getting the odds for the Died group and the odds for the Survived group in those next two rows. That is not what I get. First of all, the table is headed Risk Estimate, and we got it by clicking Risk in the dialog box. But risk and odds are different things. Risk refers to the kinds of numbers I like, while odds refers to the kinds of numbers bookies like. The odds of dying if you are in the Ritonavir group are 71/472 = .1504. The risk of dying if you are in the Ritonavir group is 71/(472 + 71) = 71/543 = .1308. This says that 13% of the Ritonavir subjects died. Of course, neither of these numbers appears in the table above. But if you take the risk of dying in the Placebo group (148/547 = .2706), divided by the risk of dying in the Ritonavir group, you get .2706/.1308 = 2.069, which is a number you see above. In other words, the second line is giving you a risk ratio rather than an odds ratio, and it is the relative risk of dying. The relative risk of surviving if you were in the Placebo group is .839.
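The difference between odds and risk, and between the odds ratio and the two risk ratios, is easy to see if you compute them side by side. The following is a small Python sketch using the counts from the data table. The function and variable names are my own and have nothing to do with the SPSS row labels.

```python
# Counts from the data table.
ritonavir = {"survived": 472, "died": 71}
placebo = {"survived": 399, "died": 148}


def odds_of_dying(group):
    """Odds = number who died / number who survived."""
    return group["died"] / group["survived"]


def risk(group, outcome):
    """Risk = number with the outcome / total number in the group."""
    return group[outcome] / (group["survived"] + group["died"])


# Odds ratio: Placebo odds of dying relative to Ritonavir odds of dying.
odds_ratio = odds_of_dying(placebo) / odds_of_dying(ritonavir)

# Risk ratios (relative risk), Placebo relative to Ritonavir.
rr_died = risk(placebo, "died") / risk(ritonavir, "died")
rr_survived = risk(placebo, "survived") / risk(ritonavir, "survived")

print(f"odds of dying, Ritonavir  = {odds_of_dying(ritonavir):.4f}")
print(f"risk of dying, Ritonavir  = {risk(ritonavir, 'died'):.4f}")
print(f"odds ratio (Placebo/Rit.) = {odds_ratio:.3f}")
print(f"inverse odds ratio        = {1 / odds_ratio:.3f}")
print(f"relative risk of dying    = {rr_died:.3f}")
print(f"relative risk of survival = {rr_survived:.3f}")
```

Run as written, the printed values should match the figures quoted in the text (.1504, .1308, 2.466, .406, 2.069, and .839).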

So, I know what the numbers mean, but I don't see any way of pairing that knowledge with what the labels on the rows say. I'm sure that there is an obvious explanation, but I can't see it. If anyone else does, please let me know.

But while I am here, I should tell you a few things that will be helpful. First of all, you would do well to read something about risk and relative risk. A very brief search of Google turned up http://www.shef.ac.uk/uni/projects/wrp/risk.html. This is a very nice web page, partially because it was written by someone who was using numbers to try to figure out stuff like this--in other words, it wasn't written by someone who knew too much. I recommend it.

The second thing to know is how to access the help system in SPSS to the best advantage. First of all, go to the SPSS output page and double click on the output table, then click once on "For cohort OUTCOME = Died," and then right click. This will give you a menu, which offers a choice of "What's this?" Click on that and you get "Displays the estimate of the relative risk for the defined event." Well, that is what sent me off to thinking about risk rather than odds, so it was worthwhile.

Next, do the same thing, except that instead of selecting "What's this?" select "Results coach." That gives you access to the SPSS help system, and can be very useful. It didn't particularly help me this time, but it does for other things. You should know about it.


2 & 3. These problems added an additional group to Siegel's study, which makes it difficult to form an overall odds ratio. An odds ratio is the ratio of two odds, and here you have three. The point that I wanted you to see is that you can use odds ratios in this situation if you can defend using one group as a "control" group--or a base group. The data for 30 subjects in each group were:

 

           Survived   Died
Group A       21        9
Group B       11       19
Group C        1       29

Group A is a group that received 4 increasing injections of morphine in the same setting. It seems to me that it might be useful to ask what happens to the odds of dying if you compare that condition to a group (Group B) that received morphine in a new setting. This gives you
    odds ratio (B/A) = (19/11) / (9/21) = 4.03,
meaning that the rat's odds of dying are about 4 times greater when you switch the context than when you leave it the same.

(Notice that in solving that problem I just used one formula. I merely calculated the odds for group B in the numerator and the odds for group A in the denominator, and then divided those two. Nothing sneaky.)

Next we can look at what morphine tolerance does for you. Group A has built up tolerance by being injected 4 times in Context A. Group C has had saline for the first three doses, so it is getting morphine for the first time. Both groups get all their injections in the same context. The odds ratio here is
    odds ratio(C/A) = (29/1) / (9/21) =  67.67
meaning that the rat's odds of dying are 67.67 times greater when he has not had the opportunity to build up morphine tolerance. 
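The two comparisons above follow one pattern: pick a base group, compute each group's odds of dying, and divide by the base group's odds. Here is a short Python sketch of that pattern. Treating Group A as the base is the choice defended in the text, and the names are purely illustrative.

```python
# (survived, died) counts for the three groups of 30 rats.
groups = {
    "A": (21, 9),
    "B": (11, 19),
    "C": (1, 29),
}


def odds_of_dying(counts):
    """Odds of dying = died / survived."""
    survived, died = counts
    return died / survived


base = "A"  # the group we are willing to defend as the "control"
base_odds = odds_of_dying(groups[base])

# Odds ratio of every other group relative to the base group.
odds_ratios = {
    name: odds_of_dying(counts) / base_odds
    for name, counts in groups.items()
    if name != base
}

for name, ratio in odds_ratios.items():
    print(f"odds ratio ({name}/{base}) = {ratio:.2f}")
```

Changing `base` to another group gives a different pair of odds ratios, which is why the choice of base group has to be defended on substantive grounds.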


4. In the book I talk about the Dabbs and Morris (1991) study of the relationship between delinquency and testosterone. Below are the data separated by High and Low SES subjects. Run an appropriate analysis within each SES group. What conclusions would you draw from these data?

Low SES

 

          Delinquent   Not Delinq.
Normal       190          1104
High          63           140

For the low SES group there is a clear relationship between testosterone and delinquency. High testosterone participants are much more likely to have a history of delinquency. The odds ratio for delinquency is (63/140) / (190/1104) = 2.61.

High SES

 

          Delinquent   Not Delinq.
Normal        53          1114
High           3            70

For the high SES group there is no relationship between testosterone and delinquency. High testosterone participants are no more likely to have a history of delinquency than the normal group. The odds ratio for delinquency is (3/70) / (53/1114) = 0.90. The chi-square was .03, for a clearly nonsignificant result, so we have no reason to think that the odds ratio differs from 1.00, which is the value we would expect under the null.
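To run the same analysis within each SES group, one rough approach in Python is to compute the odds ratio and the Pearson chi-square (without continuity correction) separately for each 2 x 2 table. The sketch below uses the shortcut formula N(ad - bc)^2 / (product of the four marginal totals), which applies only to 2 x 2 tables. The names are my own.

```python
# Each table: rows are Normal and High testosterone,
# columns are (Delinquent, Not delinquent).
low_ses = [[190, 1104],
           [63, 140]]
high_ses = [[53, 1114],
            [3, 70]]


def odds_ratio(table):
    """Odds of delinquency for High relative to Normal testosterone."""
    (a, b), (c, d) = table
    return (c / d) / (a / b)


def chi_square_2x2(table):
    """Pearson chi-square for a 2 x 2 table, no continuity correction."""
    (a, b), (c, d) = table
    n = a + b + c + d
    marginals = (a + b) * (c + d) * (a + c) * (b + d)
    return n * (a * d - b * c) ** 2 / marginals


CRITICAL_05 = 3.84  # critical value of chi-square on 1 df at alpha = .05

for label, table in (("Low SES", low_ses), ("High SES", high_ses)):
    chi_sq = chi_square_2x2(table)
    verdict = "significant" if chi_sq > CRITICAL_05 else "not significant"
    print(f"{label}: odds ratio = {odds_ratio(table):.2f}, "
          f"chi-square = {chi_sq:.2f} ({verdict})")
```

The two strata tell different stories, which is the point of splitting the data by SES before drawing conclusions.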


5. Last class I mentioned "effect size," and suggested that odds ratios are one way to get a measure of the size of an effect. The basic idea is that we could have a statistically significant effect that is trivial, and another statistically significant effect that is important.

I chose to use the example that Rosenthal used, because it gives us a good way to see the difference, and to worry about which one is "correct."

The data follow:

 

           Heart Attack   No Heart Attack
Aspirin        104            10,933
Placebo        189            10,845

The chi-square statistic computed on this sample is 25.013, which has a probability under the null of .000000569. We can clearly reject the null, but is it worth it? The difference looks to be very small. When you are talking about nearly 11,000 subjects in each group, a difference of 85 heart attacks doesn't look all that big.

One approach would be to calculate Cramer's phi as a measure of association. This is given by SPSS as .034, and you will see that this is a pretty small number. (Cramer's phi can range from 0 to 1.00, and this is pretty close to 0.)
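For a 2 x 2 table, phi is just the square root of chi-square divided by N, so it is easy to check by hand. The Python sketch below recomputes chi-square, its p-value on 1 df, and phi from the aspirin data. It is an illustration with names of my own choosing, not the SPSS procedure.

```python
import math

# Rows are Aspirin and Placebo; columns are (Heart attack, No heart attack).
table = [[104, 10933],
         [189, 10845]]

(a, b), (c, d) = table
n = a + b + c + d

# Pearson chi-square for a 2 x 2 table (no continuity correction).
chi_sq = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))

# Upper-tail probability for chi-square on 1 df.
p_value = math.erfc(math.sqrt(chi_sq / 2))

# Phi for a 2 x 2 table: sqrt(chi-square / N).
phi = math.sqrt(chi_sq / n)

print(f"N = {n}")
print(f"chi-square = {chi_sq:.3f}, p = {p_value:.3g}")
print(f"phi = {phi:.3f}")
```

Because N sits in the denominator, a very large sample can give a large chi-square and a tiny phi at the same time, which is exactly the situation here.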

An alternative approach would be to look at odds ratios. The odds of having a heart attack if you are in the Placebo group are 189/10845 = .017. The odds of a heart attack if you are in the Aspirin group are 104/10933 = .0095. This gives us an odds ratio of .017/.0095 = 1.79 (or 1.83 if you carry the odds to more decimal places before dividing). The odds of having a heart attack if you do not take aspirin are about 1.8 times the odds if you do. That is why I pop an aspirin every morning.
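The odds ratio here is sensitive to how early you round the odds, because both odds are so small. A brief Python sketch, with illustrative names of my own, makes the comparison explicit.

```python
# (heart attack, no heart attack) counts for each group.
aspirin = (104, 10933)
placebo = (189, 10845)


def odds(counts):
    """Odds of a heart attack = attacks / non-attacks."""
    attacks, no_attacks = counts
    return attacks / no_attacks


# Odds ratio from the unrounded odds.
odds_ratio = odds(placebo) / odds(aspirin)

# Odds ratio from the odds as rounded in the text (.017 and .0095).
odds_ratio_rounded = 0.017 / 0.0095

print(f"odds, Placebo = {odds(placebo):.4f}")
print(f"odds, Aspirin = {odds(aspirin):.4f}")
print(f"odds ratio (unrounded odds) = {odds_ratio:.2f}")
print(f"odds ratio (rounded odds)   = {odds_ratio_rounded:.2f}")
```

Either way the odds of a heart attack are roughly 1.8 times as large without aspirin, so the conclusion does not depend on the rounding.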

Rosenthal's point was that this was a very important study, with a result so important that they cancelled the study early and told everyone to take aspirin, and yet it has a very small measure of association.

Last revised: 06/04/03