        | Survived | Died | Totals |
Group 1 |    21    |   9  |   30   |
Group 2 |    11    |  19  |   30   |
Totals  |    32    |  28  |   60   |
The dependent variable is the number of rats who survived and the number who died.
        | Survived | Died | Totals |
Group 1 |   .70    | .30  |  1.00  |
Group 2 |   .367   | .633 |  1.00  |
Average |   .533   | .467 |  1.00  |
30% of the rats in Group 1, and 63% of the rats in Group 2 died from morphine overdose.
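If you want to check the proportions by hand or outside SPSS, here is a minimal sketch in Python (not SPSS) that turns the cell counts into row proportions. The names `counts`, `row_proportions`, and `pooled` are made up for this illustration; only the four cell counts come from the table.

```python
# Cell counts from the frequency table: (survived, died) for each group.
counts = {"Group 1": (21, 9), "Group 2": (11, 19)}

def row_proportions(survived, died):
    """Return (proportion survived, proportion died) for one row."""
    total = survived + died
    return survived / total, died / total

proportions = {group: row_proportions(*cells) for group, cells in counts.items()}

# The "Average" row pools all 60 rats, ignoring group membership.
total_survived = sum(s for s, _ in counts.values())
total_died = sum(d for _, d in counts.values())
pooled = row_proportions(total_survived, total_died)

for group, (p_survived, p_died) in proportions.items():
    print(f"{group}: survived {p_survived:.3f}, died {p_died:.3f}")
print(f"Average: survived {pooled[0]:.3f}, died {pooled[1]:.3f}")
```

The pooled survival proportion is the value that gets used as the common survival probability when the null hypothesis is assumed true.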
Now we are going to forget Siegel's actual data for a moment and create some new data that we would expect to find if the null hypothesis is true, and then compute data we would expect to find if the null is false. For each set of data we will calculate a bunch of chi-square test statistics. I want you to see what these statistics look like under the true and the false null hypotheses. Finally, we will go back to Siegel's data and calculate a chi-square for them and draw a conclusion.
Null hypothesis true:
We are going to start with the assumption that the probability of survival is the same in each condition. Since 53.33% of Siegel's rats survived overall, we will assume that .5333 is the probability of survival in each condition. (Note that we have just stated the assumption that the null hypothesis is true.)
For the null = true condition, we will create 30 subjects in each of 2 groups (N = 60), and we will set the probability of survival = .5333 for each group. Each of you will repeat this experiment 15 times, for a total of 150 replications. Then we will look at the combined results as an illustration of the chi-square distribution.
We will do this by using SPSS to draw random samples. The software (syntax) will assign pseudo-rats to the Survived vs. Died outcome on the basis of random numbers. For example, if I draw numbers uniformly distributed between 0 and 1, I will call a Group 1 animal a survivor if its random number is < .5333. Otherwise it will be classed as a victim of drug overdose. Because we are generating results under conditions where the null hypothesis is true, I will also call a Group 2 animal a survivor if its random number is < .5333. Otherwise it will be classed as Died. You should be able to see that over the long term this will mean that 53.3% of the animals in both groups will survive, and 46.7% of the animals in both groups will die. However, that won't necessarily be the result for any given sample of 30 rats. In fact, it probably won't be.
The following program looks a bit clumsy because it generates data separately for the two groups, even though the population proportions of survival are the same. I have done that so that it is simple to modify the program for a false null hypothesis.
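The assignment rule just described can be sketched in a few lines of Python. This is only an illustration of the idea, not the SPSS syntax used in the lab; the function name `simulate_group` and the seed value are arbitrary choices for the example.

```python
import random

P_SURVIVE = 0.5333   # common survival probability assumed under the null
N_PER_GROUP = 30

def simulate_group(n, p_survive, rng):
    """Return (survived, died) counts for n pseudo-rats.

    A rat dies when its uniform random number is greater than p_survive,
    and survives otherwise.
    """
    survived = sum(1 for _ in range(n) if rng.random() <= p_survive)
    return survived, n - survived

rng = random.Random(20010927)  # arbitrary seed so the run can be repeated
table = {group: simulate_group(N_PER_GROUP, P_SURVIVE, rng) for group in (1, 2)}

for group, (survived, died) in table.items():
    print(f"Group {group}: survived {survived}, died {died}")
```

Running it with different seeds shows the point made above: the two groups share one survival probability, yet the counts in any single sample of 30 will usually differ.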
To generate data, do the following:
Start SPSS
Set a random seed. You can just go to Transform/Random Number Seed and take the default, although we had problems with that. It is safer to just type in a big number.
Create a variable named Group with 30 1s and 30 2s. This just sets up the data file for 60 animals and assigns them to groups.
Now we need to create some outcomes. We will first give everyone a 1, to make them all into survivors. That is just for a starting point. Then we will kill off a bunch. Enter the following as syntax statements--note carefully the placement of parentheses. Note also that "outcomexx" is spelled "outcomxx" so as not to exceed 8 characters.
COMPUTE outcom1 = 1 .
IF (((Group = 1) and (rv.uniform(0,1) gt .5333)) or ((Group = 2) and (rv.uniform(0,1)
gt .5333))) outcom1 = 2 .
EXECUTE .
(Note: the notation "gt" means "greater than," just as "ge" means "greater than or equal to." Be sure to leave spaces around "gt", "and", and "or.")
Next you want to copy and paste this 14 more times, editing it to create variables named outcom2, ..., outcom15.
To see what the results look like, invoke Analyze/Descriptive Statistics/Crosstabs. Put Group on the rows, and outcom1, outcom2, ..., outcom15 on the columns. You must also click the Statistics button at the bottom of the dialog box, and then choose chi-square. You will get fifteen tables that look like the frequency table at the top of this assignment, each of which will have a chi-square statistic.
The chi-square statistic is a measure of the degree to which the Survive/Die ratio is the same or different in the two groups. (If exactly the same number died in each group, chi-square would be 0. If the two groups have quite different survival rates, the chi-square values should be large.)
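To make that description concrete, here is a minimal Python sketch of the Pearson chi-square computed from observed and expected counts, without any continuity correction. The function name and the two small tables are invented for illustration; they are not Siegel's data and not SPSS output.

```python
def pearson_chi_square(table):
    """Pearson chi-square (no continuity correction) for a table of counts.

    `table` is a list of rows, e.g. [[survived_1, died_1], [survived_2, died_2]].
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    chi_square = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Count expected if the Survive/Die ratio were identical in both groups.
            expected = row_totals[i] * col_totals[j] / n
            chi_square += (observed - expected) ** 2 / expected
    return chi_square

# Hypothetical tables, made up for this example.
same_ratio = [[16, 14], [16, 14]]        # identical split in both groups
different_ratio = [[25, 5], [5, 25]]     # very different splits

print(round(pearson_chi_square(same_ratio), 2))
print(round(pearson_chi_square(different_ratio), 2))
```

The first table gives a statistic of zero because every observed count equals its expected count; the second gives a much larger value because the groups disagree strongly.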
Neatly record your tables and their corresponding chi-square values (to two decimal places), and pass me a sheet with the fifteen chi-squares on it.
STOP!!! Go back and reread that last sentence about the number of decimal places!! Surprisingly, everyone got that right last week. That is a first!!
I'll record the data and make the file available to you. (You do not have to give me the actual cell frequencies, only the values of chi-square.) Don't run away when you have done that, because I want to give the compiled results back to you.
After the results have been compiled into a single data file, you should plot a histogram of the resulting chi-square values. We will then discuss chi-square in class on Tuesday, and you can see if your results looked like what you should expect with repeated sampling.
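For a rough preview of what the compiled results might look like, the following Python sketch simulates 150 replications under a true null and prints a crude text histogram. It stands in for the pooled class data rather than reproducing it: the seed, the function names, and the bin width of 1 are all assumptions made for the example. It uses the 2x2 shortcut N(ad - bc)^2 divided by the product of the four marginal totals, which is algebraically the same Pearson chi-square.

```python
import random
from collections import Counter

def simulate_table(p1, p2, n, rng):
    """Return (a, b, c, d): survived/died in Group 1, then survived/died in Group 2."""
    a = sum(1 for _ in range(n) if rng.random() <= p1)
    c = sum(1 for _ in range(n) if rng.random() <= p2)
    return a, n - a, c, n - c

def chi_square_2x2(a, b, c, d):
    """Pearson chi-square for a 2x2 table via N(ad - bc)^2 / (product of margins)."""
    n = a + b + c + d
    margins = (a + b) * (c + d) * (a + c) * (b + d)
    if margins == 0:
        return 0.0  # an empty row or column: treated as no measurable difference
    return n * (a * d - b * c) ** 2 / margins

rng = random.Random(12345)  # arbitrary seed
values = [round(chi_square_2x2(*simulate_table(0.5333, 0.5333, 30, rng)), 2)
          for _ in range(150)]

# Frequency distribution in bins of width 1, printed as rows of asterisks.
freq = Counter(int(v) for v in values)
for lower in range(max(freq) + 1):
    print(f"{lower:2d} to <{lower + 1:2d}: {'*' * freq[lower]}")
```

Swapping the simulated `values` list for the class's recorded chi-squares would tabulate the real results in the same way.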
Print out the output page so that you have a record of your results.
Now I want you to do this all over again, but this time generate data where the null hypothesis is false. In this case the probability of dying is much lower for one group than for the other.
Saying "the null is false" is very imprecise, because it doesn't say how false. I have to pick some values, and I'll use 70% survivors for the Same Context group, and 63.33% survivors for the Different Context group. That is a pretty small difference, but probably one that is big enough to see.
We will probably not have time to generate 15 sets of results when the null is false, but the following code would do so.
COMPUTE outcom1 = 1 .
IF (((Group = 1) and (rv.uniform(0,1) gt .
You can create 15 copies of these statements, modify them to name the variables outcom1 ... outcom15, compute the data, and then compute the tables and chi-square. Note that these chi-square values are appreciably higher on average, but there are very likely to be one or two small values as well.
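The same idea can be sketched in Python, with a loop taking the place of the fifteen pasted copies. The survival probabilities below are the two values quoted in the text for the false-null case; everything else (the names, the seed, the print format) is an assumption made for illustration and is not the SPSS syntax.

```python
import random

# Survival probabilities quoted in the text for the false-null case.
P_SURVIVE = {1: 0.70, 2: 0.6333}
N_PER_GROUP = 30
N_REPLICATIONS = 15

def simulate_group(n, p_survive, rng):
    """Return (survived, died) counts for n pseudo-rats."""
    survived = sum(1 for _ in range(n) if rng.random() <= p_survive)
    return survived, n - survived

rng = random.Random(987654)  # arbitrary seed
replications = [
    {group: simulate_group(N_PER_GROUP, p, rng) for group, p in P_SURVIVE.items()}
    for _ in range(N_REPLICATIONS)
]

# One (survived, died) pair per group for each replication.
for k, table in enumerate(replications, start=1):
    print(f"outcom{k}: Group 1 {table[1]}, Group 2 {table[2]}")
```

Changing the two entries in `P_SURVIVE` is all it takes to try a larger or smaller departure from the null.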
Now I want you to calculate chi-square for Siegel's data. Add a column labeled Siegel, and put in 1's and 2's in correspondence with the table at the beginning of this assignment. In other words, you will have 21 1's and 9 2's for the animals in Group 1, and 11 1's and 19 2's for the animals in Group 2. Now run chi-sq on this.
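As a cross-check on the data entry, this Python sketch builds the same two columns, cross-tabulates them, and computes the Pearson chi-square (no continuity correction) from observed and expected counts. It is a sketch for checking your SPSS result, not a replacement for running it; the variable names are chosen for the example.

```python
from collections import Counter

# Columns as described: 30 animals per group, coded 1 = Survived, 2 = Died.
group = [1] * 30 + [2] * 30
siegel = [1] * 21 + [2] * 9 + [1] * 11 + [2] * 19

cells = Counter(zip(group, siegel))   # (group, outcome) -> observed count
n = len(group)

chi_square = 0.0
for g in (1, 2):
    for outcome in (1, 2):
        row_total = sum(cells[(g, o)] for o in (1, 2))
        col_total = sum(cells[(gg, outcome)] for gg in (1, 2))
        expected = row_total * col_total / n
        observed = cells[(g, outcome)]
        chi_square += (observed - expected) ** 2 / expected
        print(f"Group {g}, outcome {outcome}: observed {observed}, expected {expected:.1f}")

print(f"Pearson chi-square = {chi_square:.2f}")
```

If the observed counts printed here do not match the table at the beginning of the assignment, the column was entered incorrectly.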
I have created a page showing the results of the first part of this lab. It contains both the histogram and the frequency distribution. You could recreate the output completely from the frequency distribution.
Last revised: 09/27/01