

Class 1--Introduction

8/28/01

Overview of course

1. Descriptive Statistics

a. Standard descriptive statistics

b. Graphics

c. Probability

2. Inferential Statistics

a. Hypothesis testing

b. Randomization tests

c. Chi-square

d. t-tests

e. Correlation and regression

f. Analysis of variance

g. Multiple comparison techniques

3. Computing

a. Major element of the course

1. Most will be done on PCs using SPSS.

2. The emphasis will be on using computers to understand statistics

b. We will alternate lecture and lab

c. Electronic mail

1. We will pass messages back and forth, and I’ll be available for questions that way (as well as in person).

2. We’ll look at the internet and how to use it for our particular purposes.

d. The World Wide Web

1. Describe how to use (access) it for this course

2. Tell them about my web pages and give addresses.

e. NO PIRACY

4. Interpretation

a. Students seem to have a hard time going from data to interpretation. An important chunk of this course involves interpretation of results.

b. I want to stress interpretation. Answers to the questions should not just be numbers.

c. Use of real data sets (or close to that).

5. I'm going to organize most classes around a "study" rather than around a statistical "topic."

Texts

1. Howell, D. C. Statistical Methods for Psychology, 5th edition

2. Miscellaneous handouts

3. I expect people to do the reading before the class.

a. This will be essential for the kind of course I envision (especially for the lab part).

b. Not everything in the book needs to be learned.

c. Formulae are generally not for memorization

 

Homework/Lab assignments

1. Much of the work will involve working problems in class on Thursday.

2. Whenever I assign homework or lab problems, I want them to be turned in, even if I forget to say so at the time.

3. Students should have a 3.5" disk to save stuff on.

4. Make sure they all have zoo accounts and know how to log on and read mail. To get a zoo account, they can go to Obtaining an account on Zoo. The dial-up number can be found at Dial-in Lines.

 

Grading

1. 1 midterm and 1 final (45% each)

2. Assignments (10%)

 

Office Hours

1. By appointment (though I am usually in).

2. They can call me at home (872-1585)

3. NOT before class

First Assignment

1. Read Chapters 1 and 2 of Howell

a. Most of this is straight review

b. Go easy on the graphics this time around

c. I want everyone to come to class with at least one question about the material there.

 

Start of Course. 

The following is a modified version of the first class that we had two years ago. In that class I basically lectured on this material. This year I want to begin to see how we can get away from standard lectures--at least occasionally. I want you to read the following material carefully, think about the questions I ask, even though I give many of the answers I am seeking, and come back to the next class ready to discuss much of this. We will not routinely conduct classes this way, but I wanted to try at least once.

 

Seligman, Nolen-Hoeksema, Thornton, and Thornton (1990) ran a simple experiment to examine how optimists and pessimists respond to failure. They took 33 male and female members of the swim teams at the University of California at Berkeley. Each subject took the Attributional Style Questionnaire (ASQ; Peterson, Semmel, von Baeyer, Abramson, Metalsky, and Seligman, 1982). This is a self-report scale assessing how subjects respond to positive and negative events along three dimensions (stable-unstable, global-specific, and internal-external). The scores for positive events are summed across those three dimensions to create a composite-positive score, and negative events are summed to create a composite-negative (CN) score. For today we are focusing on the CN score because we are interested in how subjects respond to negative events.

At a team practice, all subjects were asked to swim their best event as fast as possible, but in each case the time that was reported was falsified to indicate poorer than expected performance, hence disappointing each swimmer and presenting them with a negative outcome. Half an hour later, each swimmer was again asked to perform, and their times were recorded.

According to theory, optimistic subjects, when presented with a negative event, would have a positive outlook for the future, would try harder, and thus should do better on the second trial (taking into account any fatigue from the first trial). On the other hand, Seligman et al. predicted that pessimistic subjects would not voluntarily try harder on the second trial, but would have an expectation that there is little that they can do. They would not be expected to do better on a retry.

The dependent variable for this analysis was the ratio of Time2/Time1. Any ratio greater than 1 would mean that the subject did worse on the second trial, while a ratio less than 1 would indicate better performance. Thus Seligman et al. would expect higher values for pessimistic subjects.

I am going to use this example to illustrate many aspects of this course, including definitions and ways to go about stating and examining an hypothesis. I am going way beyond Chapter 2. The data are available at pessimism.dat as a raw data file and at pessimism.sav as an SPSS system file. In a sense this will be an overview of many different statistical approaches to making sense of data. I want people to get that overview, not worry about the specifics.

  • What are the variables?

  • How would you describe those variables?

  • What is the hypothesis behind the study?

  • How might we examine this hypothesis?

  • What do we gain or lose by dichotomizing optimism?

  • Why did they use the t2/t1 ratio?

    • What else could they have used?

    • How might we decide what to use?

    • What would happen if we made a different decision?

The data for this experiment appear below. They are in line with the data that Seligman et al. found, having the same means and standard deviations. The first column is the ratio of Time2/Time1. The second column is the subject’s pessimism score. The third column (G2) creates 2 groups by breaking the data at the median with respect to Pessimism. (I split them by flipping a coin when they were tied.) The last column (G3) breaks the subjects into 3 groups (1 = low, 2 = medium, and 3 = high pessimism).

Ratio     Pessim.  G2   G3



0.9833    10    1    1

1.0447     9    1    1

1.0323    12    1    2

0.9846    13    2    3

1.1075    13    2    2

1.0748    11    1    2

1.0435    17    2    3

0.9518    15    2    3

0.9980    13    2    3

0.9139    11    1    1

0.9548    11    1    1

1.0017    12    2    2

1.0771    13    2    2

0.9749     9    1    1

1.0255    14    2    3

1.0454    13    2    2

0.9619    11    1    1

 

0.9441     9    1    1

0.9658    15    2    3

1.0410    12    1    2

0.9226    13    2    2

1.0000    13    2    2

0.9313    11    1    1

0.9363    10    1    1

0.9985    11    1    2

0.8719    11    1    2

1.0029    14    2    3

0.9344    14    2    3

0.9450    10    1    1

1.0098    14    2    3

0.8640     7    1    1

1.0645    15    2    3

1.0525    15    2    3

 

  • What do you think of my idea of splitting the subjects by flipping a coin when they were tied for the same pessimism score and I needed only 4 people with 11’s in Group 1?

  • The results of dichotomizing optimism follow. (Group 1 = Optimists, the subjects with low pessimism scores; Group 2 = Pessimists)

Report: RATIO

G2       Mean      N     Std. Deviation

1.00     .9670     16    6.033E-02

2.00     1.0110    17    5.067E-02

Total    .9897     33    5.906E-02

  • What would you conclude?
  • The results of trichotomizing optimism follow.

Report: RATIO

G3       Mean      N     Std. Deviation

1.00     .9504     11    4.490E-02

2.00     1.0157    11    6.897E-02

3.00     1.0030    11    4.183E-02

Total    .9897     33    5.906E-02

  • What would you conclude?
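
The two Report tables can be reproduced directly from the data listed earlier. What follows is a minimal sketch in Python rather than SPSS (the course itself uses SPSS; Python and the names `DATA`, `records`, and `summarize` are my own choices for illustration). It groups the Ratio column by G2 and then by G3 and reports the mean, N, and sample standard deviation for each group.

```python
from statistics import mean, stdev

# Columns: Ratio, Pessimism, G2, G3 (copied from the data table in these notes).
DATA = """
0.9833 10 1 1
1.0447 9 1 1
1.0323 12 1 2
0.9846 13 2 3
1.1075 13 2 2
1.0748 11 1 2
1.0435 17 2 3
0.9518 15 2 3
0.9980 13 2 3
0.9139 11 1 1
0.9548 11 1 1
1.0017 12 2 2
1.0771 13 2 2
0.9749 9 1 1
1.0255 14 2 3
1.0454 13 2 2
0.9619 11 1 1
0.9441 9 1 1
0.9658 15 2 3
1.0410 12 1 2
0.9226 13 2 2
1.0000 13 2 2
0.9313 11 1 1
0.9363 10 1 1
0.9985 11 1 2
0.8719 11 1 2
1.0029 14 2 3
0.9344 14 2 3
0.9450 10 1 1
1.0098 14 2 3
0.8640 7 1 1
1.0645 15 2 3
1.0525 15 2 3
"""

# Parse each line into (ratio, pessimism, g2, g3).
records = []
for line in DATA.strip().splitlines():
    ratio, pessim, g2, g3 = line.split()
    records.append((float(ratio), int(pessim), int(g2), int(g3)))


def summarize(records, group_index):
    """Return {group: (mean, n, sample sd)} of Ratio, grouped by one column."""
    groups = {}
    for rec in records:
        groups.setdefault(rec[group_index], []).append(rec[0])
    return {g: (mean(v), len(v), stdev(v)) for g, v in sorted(groups.items())}


g2_summary = summarize(records, 2)  # two groups (median split)
g3_summary = summarize(records, 3)  # three groups

for name, summary in (("G2", g2_summary), ("G3", g3_summary)):
    print(name)
    for g, (m, n, sd) in summary.items():
        print(f"  {g}: mean={m:.4f}  N={n}  sd={sd:.4f}")
```

`stdev` divides by N - 1, which is the same definition of the standard deviation that the Report tables use.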

Finally, I have drawn a diagram plotting the ratio of t2/t1 against the pessimism score. It follows.

  • What conclusion would you draw from this plot?

    • Is this a better test of the hypothesis than the previous ones?

    • How can you decide?

  • What conclusions would you draw from the results of this experiment?

  • This example should be kept in mind when students read the introductory chapters. How does it fit with what is there?

    • Why didn't I plot the distribution of each of the variables?
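
The relationship in the plot can also be summarized with a single number, the Pearson correlation between pessimism and the Time2/Time1 ratio. The following is a sketch only, written in Python rather than SPSS, and the names `ratio`, `pessim`, and `pearson_r` are mine. It uses the standard formula, the sum of cross-products divided by the square root of the product of the two sums of squares.

```python
from math import sqrt

# Ratio and pessimism score for the 33 swimmers, in the order listed in the data table.
ratio = [
    0.9833, 1.0447, 1.0323, 0.9846, 1.1075, 1.0748, 1.0435, 0.9518, 0.9980,
    0.9139, 0.9548, 1.0017, 1.0771, 0.9749, 1.0255, 1.0454, 0.9619, 0.9441,
    0.9658, 1.0410, 0.9226, 1.0000, 0.9313, 0.9363, 0.9985, 0.8719, 1.0029,
    0.9344, 0.9450, 1.0098, 0.8640, 1.0645, 1.0525,
]
pessim = [
    10, 9, 12, 13, 13, 11, 17, 15, 13, 11, 11, 12, 13, 9, 14, 13, 11,
    9, 15, 12, 13, 13, 11, 10, 11, 11, 14, 14, 10, 14, 7, 15, 15,
]


def pearson_r(xs, ys):
    """Pearson correlation: sum of cross-products over sqrt(SSx * SSy)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ssx = sum((x - mean_x) ** 2 for x in xs)
    ssy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(ssx * ssy)


r = pearson_r(pessim, ratio)
print(f"r = {r:.3f}")
```

A positive r means that higher pessimism scores go with larger ratios, that is, with relatively slower second swims. Unlike the two- and three-group analyses, this approach uses every pessimism score as it stands instead of collapsing the scores into groups.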

What can I do to better fit this example with the first several chapters? Perhaps I should assign that to them as a problem.

 

An Alternative Approach--Categorical Data

I have split pessimism at the median. I could also split the ratio at the median or some other point.

I split it at 1, because ratios less than 1 represented better performance on the second trial, and ratios greater than 1 represented poorer performance.

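This leads to the following contingency table:

              Improved       Did not improve
              (ratio < 1)    (ratio >= 1)       Total

Optimists         12               4              16

Pessimists         6              11              17

Total             18              15              33

Notice that 12 out of 16 Optimists improved, while 11 out of 17 Pessimists did not (10 got worse, and 1 had a ratio of exactly 1.0000).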

A statistical test on this would clearly be significant.

This example should not be taken as an indication that median splits (especially with two variables) are a good idea. There is a lot to suggest that median splits like this are a bad idea. We will discuss this later in the semester.
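
As a rough check on the counts above, the cell frequencies and an uncorrected Pearson chi-square can be computed from the Ratio and G2 columns. This is a sketch under stated assumptions: it is written in Python rather than SPSS, the names `ratio`, `g2`, `table`, and `chi_square` are mine, a ratio below 1 is scored as "improved" with everything else as "did not improve," and no continuity correction is applied.

```python
# Ratio and G2 (1 = optimists, 2 = pessimists), in the order listed in the data table.
ratio = [
    0.9833, 1.0447, 1.0323, 0.9846, 1.1075, 1.0748, 1.0435, 0.9518, 0.9980,
    0.9139, 0.9548, 1.0017, 1.0771, 0.9749, 1.0255, 1.0454, 0.9619, 0.9441,
    0.9658, 1.0410, 0.9226, 1.0000, 0.9313, 0.9363, 0.9985, 0.8719, 1.0029,
    0.9344, 0.9450, 1.0098, 0.8640, 1.0645, 1.0525,
]
g2 = [
    1, 1, 1, 2, 2, 1, 2, 2, 2, 1, 1, 2, 2, 1, 2, 2, 1,
    1, 2, 1, 2, 2, 1, 1, 1, 1, 2, 2, 1, 2, 1, 2, 2,
]

# Build the 2 x 2 table: rows are G2 groups, columns are [improved, did not improve].
table = {1: [0, 0], 2: [0, 0]}
for r, g in zip(ratio, g2):
    table[g][0 if r < 1 else 1] += 1


def chi_square(table):
    """Uncorrected Pearson chi-square: sum of (observed - expected)^2 / expected."""
    rows = list(table.values())
    row_totals = [sum(row) for row in rows]
    col_totals = [sum(col) for col in zip(*rows)]
    n = sum(row_totals)
    total = 0.0
    for i, row in enumerate(rows):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            total += (observed - expected) ** 2 / expected
    return total


chi2 = chi_square(table)
print(table)
print(f"chi-square (1 df, uncorrected) = {chi2:.3f}")
```

The result can be compared with 3.84, the critical value of chi-square on 1 degree of freedom at the .05 level. A continuity correction, which some texts recommend for 2 x 2 tables, would give a smaller value.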

Summary of Example

This example actually represents an overview of the entire course, though at a very elementary level.

  • We saw something about how to create variables, and how to distinguish between dependent and independent variables.
  • We stated several hypotheses that linked the experiment to the anticipated results.
  • We looked at comparing the means of two groups which differed on the independent variable. This is an example of a t test.
  • We looked at comparing the means of three groups--this is a lead-in to the analysis of variance.
  • We looked specifically at the relationship between two variables (pessimism and performance) and calculated a correlation coefficient.
  • We looked at categorizing both variables and setting up a contingency table. This is a chi-square test.
  • We saw that everywhere we looked, we were basically getting at relationships (whether through correlation, analysis of variance, or chi-square).

Not all of these approaches are equally valuable, but they are all possible. They ask slightly different questions, but get at the same overall relationship.

The rest of the course will focus on each of these techniques in turn, after we have looked at some basic material.

Last revised: 08/24/01