
Jo Epping's practicum presentation looked at cancer data. There is an argument in the literature that psychological factors play a role in survival from cancer. One important variable seems to be "avoidance." Patients with a high incidence of reported avoidance are believed to have a poorer outcome. The data are available at Jo.sav and Jo.dat.
She divided her subjects into two groups. Group 1 was classed as Success because those subjects were cancer free at followup. Group 2 was classed as Fail because those subjects were either not cancer free or had died at followup.
The dependent variable is the reported level of avoidance at Time 1.
Have them run this test using SPSS
Data
              Group 1 (Success)   Group 2 (Fail)
Mean                14.41             17.00
Variance           21.247            18.706
n_i                    49                18
Pooled estimates
Here we will reject the null hypothesis that the two groups show equal levels of avoidance. The exact probability under H0 is .0434.
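As a rough check on the pooled t test, here is a minimal Python sketch that works from the summary statistics alone. It assumes that 21.247 and 18.706 are variances (that is the reading that gives a t close to the one SPSS reports), and the function names pooled_t and two_tailed_p are my own, not anything from SPSS. The p value comes from a simple numerical integration of the t density, so treat it as an approximation. Because the means in the table are rounded, the results will be close to, but not identical with, the SPSS values.

```python
import math

# Summary statistics from the Data table.
# ASSUMPTION: 21.247 and 18.706 are variances, not standard deviations.
n1, mean1, var1 = 49, 14.41, 21.247   # Group 1 (Success)
n2, mean2, var2 = 18, 17.00, 18.706   # Group 2 (Fail)


def pooled_t(n1, mean1, var1, n2, mean2, var2):
    """Return (t, df, pooled variance) for an equal-variance two-sample t test."""
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * var1 + (n2 - 1) * var2) / df
    se_diff = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (mean2 - mean1) / se_diff, df, pooled_var


def two_tailed_p(t, df, steps=20000):
    """Approximate two-tailed p for Student's t using Simpson's rule."""
    const = math.gamma((df + 1) / 2) / (
        math.sqrt(df * math.pi) * math.gamma(df / 2)
    )

    def density(x):
        return const * (1 + x * x / df) ** (-(df + 1) / 2)

    upper = abs(t)
    h = upper / steps                      # steps must be even for Simpson's rule
    total = density(0.0) + density(upper)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * density(i * h)
    area_0_to_t = total * h / 3            # P(0 < T < |t|)
    return 1 - 2 * area_0_to_t


t, df, pooled_var = pooled_t(n1, mean1, var1, n2, mean2, var2)
p = two_tailed_p(t, df)
print(f"pooled variance = {pooled_var:.3f}")
print(f"t({df}) = {t:.3f}, two-tailed p = {p:.4f}")
```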
A good psychologist now would go out and compute a measure of the magnitude of the difference. I might suggest Cohen's d.
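For Cohen's d, a minimal sketch along the same lines divides the difference between the means by the pooled standard deviation. Again I am assuming that the 21.247 and 18.706 in the Data table are variances, and the variable names are just illustrative.

```python
import math

# ASSUMPTION: 21.247 and 18.706 are variances, not standard deviations.
n1, mean1, var1 = 49, 14.41, 21.247   # Group 1 (Success)
n2, mean2, var2 = 18, 17.00, 18.706   # Group 2 (Fail)

# Pooled standard deviation, weighting each variance by its df.
pooled_sd = math.sqrt(((n1 - 1) * var1 + (n2 - 1) * var2) / (n1 + n2 - 2))

# Cohen's d: the mean difference expressed in pooled s.d. units.
d = (mean2 - mean1) / pooled_sd
print(f"pooled s.d. = {pooled_sd:.3f}, d = {d:.2f}")
```

Under those assumptions the groups differ by a bit more than half a standard deviation, which Cohen's rough conventions would call a medium effect.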
But suppose that we treated this as a problem in correlation.
Correlational Approach
- The Outcome variable is coded 1 and 2, and we could correlate that with Avoidance.
- I'm going to run this with the regression procedure rather than the correlation procedure, just because it gives me more information to play with.
- That means I have to decide which is the dependent variable and which is the independent variable. Since survival comes later in time than avoidance, it makes sense to make that the dependent variable.
- Have them do this using SPSS
- SPSS printout:



- Notice that the correlation is .248. The test on the slope is also a test on the correlation (with one predictor), and it shows that the correlation has a t of 2.061 and a p = .043, which is exactly what we found with the t test.
- First of all, this shows us that we are really asking the same kind of question.
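- Second, note the slope = .0237. It tells us that a one unit increase in avoidance goes with a .0237 unit increase in the predicted outcome score (coded 1 = Success, 2 = Fail), so higher avoidance predicts a poorer outcome.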
- The beta translates this to standard deviation units, which is more meaningful.
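The slope, the correlation, and the beta can all be approximated from the group summary statistics, without the raw data. The sketch below is only that, a sketch: it assumes the 21.247 and 18.706 in the Data table are variances, and because the table's means are rounded it gives values near, but not exactly equal to, the .0237 and .248 on the printout. It also shows that with one predictor the beta is the same number as r.

```python
import math

# ASSUMPTION: 21.247 and 18.706 are variances, not standard deviations.
n1, mean1, var1 = 49, 14.41, 21.247   # outcome coded 1 (Success)
n2, mean2, var2 = 18, 17.00, 18.706   # outcome coded 2 (Fail)
n = n1 + n2

# Variance of avoidance (X) across all subjects:
# within-group SS plus between-group SS, divided by N - 1.
ss_within = (n1 - 1) * var1 + (n2 - 1) * var2
ss_between = n1 * n2 / n * (mean2 - mean1) ** 2
var_x = (ss_within + ss_between) / (n - 1)

# Variance of the outcome (Y), which takes only the values 1 and 2.
var_y = n1 * n2 / (n * (n - 1))

# Covariance of X and Y; the (2 - 1) is the distance between the two codes.
cov_xy = n1 * n2 / n * (mean2 - mean1) * (2 - 1) / (n - 1)

slope = cov_xy / var_x                              # outcome regressed on avoidance
r = cov_xy / math.sqrt(var_x * var_y)               # Pearson correlation
beta = slope * math.sqrt(var_x) / math.sqrt(var_y)  # standardized slope

print(f"slope = {slope:.4f}, r = {r:.3f}, beta = {beta:.3f}")
```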
Magnitude of effect
- When we talked about correlation, we talked about r² as the percentage of variability in one variable accounted for by variability in another.
- Our r² here is .06, which says that about 6% of the variability in survivability is accounted for by variability in avoidance.
- That's not huge, but it's something.
- But this really tells us that any 2-sample t can be converted to an r²-type measure.
- We could either do this by rerunning as a correlation, or by a simple formula that does the same thing.
- If we just want to have a whole matrix of correlations (i.e. our interest isn't really in predicting), then the correlation we want is the point-biserial when we have a dichotomous predictor.
- But keep in mind that the way we calculate the point-biserial is just to apply the plain old Pearson formula.
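The "simple formula" for converting a t to an r² is r² = t²/(t² + df). The sketch below applies it to the t of 2.061 on 65 df reported above, and then uses a small set of made-up scores (hypothetical, not Jo's data) to show that the plain Pearson formula applied to a 1/2 group code gives the point-biserial, and that its square matches what the formula gives from the pooled t. The function names are my own.

```python
import math


def r_squared_from_t(t, df):
    """Convert a two-sample t (on df degrees of freedom) to r squared."""
    return t * t / (t * t + df)


def pearson_r(x, y):
    """Plain Pearson correlation between two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def pooled_t(g1, g2):
    """Equal-variance two-sample t and its df, from raw scores."""
    n1, n2 = len(g1), len(g2)
    m1, m2 = sum(g1) / n1, sum(g2) / n2
    ss_within = sum((v - m1) ** 2 for v in g1) + sum((v - m2) ** 2 for v in g2)
    df = n1 + n2 - 2
    se_diff = math.sqrt(ss_within / df * (1 / n1 + 1 / n2))
    return (m2 - m1) / se_diff, df


# The t and df reported in these notes.
r2_reported = r_squared_from_t(2.061, 65)
print(f"r squared = {r2_reported:.3f}, r = {math.sqrt(r2_reported):.3f}")

# HYPOTHETICAL scores, invented purely to illustrate the equivalence.
group1 = [10, 12, 15, 9, 14, 11]
group2 = [16, 13, 18, 17]
scores = group1 + group2
codes = [1] * len(group1) + [2] * len(group2)

r = pearson_r(codes, scores)   # point-biserial = Pearson on the 1/2 code
t, df = pooled_t(group1, group2)
print(f"r squared from Pearson = {r * r:.4f}")
print(f"r squared from t       = {r_squared_from_t(t, df):.4f}")
```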
Logistic Regression
- I'll talk about this next semester, but not now.
- I just want to say that logistic regression is a better way of predicting a dichotomy, though it is not much better unless the odds of survival get very low or very high for some values of the predictor.
                Pet
           Yes     No    Total
Alive       50     28       78
Dead         3     11       14
Total       53     39       92


In the book I give a pretty limp suggestion of what to do when you have a chi-square contingency table and actually have an ordinal prediction. I do a much better job at the following web site.
Last revised: 11/05/01