Descriptive Statistics and Other Goodies

9/4/01


Announcements:

    Mail accounts

Experimental Design as seen in Optimism Study

    Review study

  • What are the variables?

  • How would you describe those variables?

  • What is the hypothesis behind the study?

  • How might we examine this hypothesis?

  • What do we gain or lose by dichotomizing optimism?

  • Why did they use the t2/t1 ratio?

    • What else could they have used?

    • How might we decide what to use?

    • What would happen if we made a different decision?

  • What do you think of my idea of splitting the subjects by flipping a coin when they were tied for the same pessimism score and I needed only 4 people with 11’s in Group 1?

    • Note that I have unintentionally committed a common error here. I have labeled the variable as Pessimism, but then talk about Optimism. It can get very confusing when you do that, and we need to be careful.

  • The results of dichotomizing optimism follow. (Group 1 = Optimists)

Report: RATIO

G2       Mean      N     Std. Deviation
1.00     .9670     16    6.033E-02
2.00     1.0110    17    5.067E-02
Total    .9897     33    5.906E-02
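The standard deviations in the report above are printed in E-notation, so 6.033E-02 means 6.033 x 10^-2 = 0.06033. As a minimal sketch (in Python rather than SPSS, and assuming the ordinary pooled-variance t test is the comparison we want), the two group means can be compared directly from these summary figures. The variable names are illustrative only.

```python
import math

# Summary statistics copied from the report (Group 1 = Optimists).
# float() reads E-notation directly: "6.033E-02" is 6.033 x 10^-2.
mean1, n1, sd1 = 0.9670, 16, float("6.033E-02")
mean2, n2, sd2 = 1.0110, 17, float("5.067E-02")

print(sd1, sd2)  # 0.06033 0.05067

# Pooled-variance t test computed from the summary statistics alone.
df = n1 + n2 - 2
pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / df
se_diff = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
t = (mean1 - mean2) / se_diff

print(f"t({df}) = {t:.2f}")
```

With these figures t comes out near -2.27 on 31 degrees of freedom, which is beyond the two-tailed .05 critical value of about 2.04.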

  • (Comment on E-notation)
  • What would you conclude?
  • The results of trichotomizing optimism follow.

Report: RATIO

G3       Mean      N     Std. Deviation
1.00     .9504     11    4.490E-02
2.00     1.0157    11    6.897E-02
3.00     1.0030    11    4.183E-02
Total    .9897     33    5.906E-02

  • What would you conclude?
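One way to examine the three-group result is a one-way analysis of variance built from the summary figures in the report. The following is only a sketch under that assumption (Python, with illustrative variable names); it is not SPSS output.

```python
# (mean, n, standard deviation) for G3 = 1, 2, 3, copied from the report.
groups = [
    (0.9504, 11, 4.490e-02),
    (1.0157, 11, 6.897e-02),
    (1.0030, 11, 4.183e-02),
]

n_total = sum(n for _, n, _ in groups)
grand_mean = sum(m * n for m, n, _ in groups) / n_total

# Between-groups and within-groups sums of squares.
ss_between = sum(n * (m - grand_mean) ** 2 for m, n, _ in groups)
ss_within = sum((n - 1) * sd ** 2 for _, n, sd in groups)

df_between = len(groups) - 1
df_within = n_total - len(groups)
f_ratio = (ss_between / df_between) / (ss_within / df_within)

print(f"grand mean = {grand_mean:.4f}")
print(f"F({df_between}, {df_within}) = {f_ratio:.2f}")
```

The grand mean reproduces the Total row (.9897), and F comes out near 4.64 on (2, 30) degrees of freedom, above the .05 critical value of about 3.32.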

Finally, I have drawn a diagram plotting the ratio of t2/t1 against the pessimism score (notice that I have switched to Pessimism). It follows.

  • What conclusion would you draw from this plot?

    • Is this a better test of the hypothesis than the previous ones?

    • How can you decide?

  • What conclusions would you draw from the results of this experiment?

  • This example should be kept in mind when students read the introductory chapters. How does it fit with what is there?

    • Why didn't I plot the distribution of each of the variables?

What can I do to better fit this example with the first several chapters? Perhaps I should assign that to them as a problem.

 

An Alternative Approach--Categorical Data

I have split pessimism at the median. I could also split the ratio at the median or some other point. In this case I would have TWO categorical variables.

I split the ratio at 1, because ratios less than 1 represented poorer performance, and ratios greater than 1 represented better performance.

This leads to the following Contingency table:

Notice that 12 out of 16 Optimists improved, while 11 out of 17 Pessimists got worse.

A statistical test on this would clearly be significant.
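A sketch of such a test, assuming an ordinary Pearson chi-square without a continuity correction. The two cells not quoted above (4 Optimists who got worse, 6 Pessimists who improved) are obtained by subtraction, which assumes every subject fell into one category or the other.

```python
# Rows: Optimists, Pessimists. Columns: improved, got worse.
# 12 and 11 are quoted in the notes; 4 and 6 are inferred by subtraction.
observed = [[12, 4],
            [6, 11]]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
n = sum(row_totals)

chi_square = 0.0
for i, row in enumerate(observed):
    for j, obs in enumerate(row):
        expected = row_totals[i] * col_totals[j] / n
        chi_square += (obs - expected) ** 2 / expected

print(f"chi-square(1) = {chi_square:.2f}")
```

The uncorrected statistic is about 5.24 on 1 degree of freedom, which exceeds the .05 critical value of 3.84. A continuity correction would give a smaller value.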

This example should not be taken as an indication that median splits (especially with two variables) are a good idea. There is a lot to suggest that median splits like this are a bad idea. We will discuss this later in the semester.

Summary of Example

This example actually represents an overview of the entire course, though at a very elementary level.

  • We saw something about how to create variables, and how to distinguish between dependent and independent variables.
  • We stated several hypotheses that linked the experiment to the anticipated results.
  • We looked at comparing the means of two groups which differed on the independent variable. This is an example of a t test.
  • We looked at comparing the means of three groups--this is a lead-in to the analysis of variance.
  • We looked specifically at the relationship between two variables (pessimism and performance) and calculated a correlation coefficient.
  • We looked at categorizing both variables and setting up a contingency table. This is a chi-square test.
  • We saw that everywhere we looked, we were basically getting at relationships (whether through correlation, analysis of variance, or chi-square).

Not all of these approaches are equally valuable, but they are all possible. They ask slightly different questions, but get at the same overall relationship.

The rest of the course will focus on each of these techniques in turn, after we have looked at some basic material.

Describing Data:

Purpose:

This started out as a lecture on descriptive statistics, but it is more than that. It is intended to illustrate simple ways of looking at (and describing) a set of data. We're interested in much broader things than means and standard deviations. Here again I am using an example to illustrate a number of points, but don't forget that the example is what it is all about, not the statistical procedures.

Example (Air Quality)

The air quality example is one I have used for several years, because it illustrates a number of issues, and is easy to understand. Every time that I use it, I discover something new.

If students would like, they can think of these as two measures of some personality attribute, or two measures of depression, or two measures of learning, etc. The major points are valid regardless of what we are measuring, and, in fact, we are looking at issues that pop up in psychology all the time, even though I am talking about air quality.

After using this example for two years, I came across a strikingly similar idea on a web page maintained by Steve Simon. He pretended that he was acting as a consultant to two nurses who wanted to evaluate replacing the standard glass-mercury thermometer with a new Tempa-Dot thermometer. I recommend his page at http://www.cmh.edu/stats/case200.htm .

Describe the background--where did the idea come from.

Air Quality

Describe the data.

The raw data are available at airqual.dat. The SPSS file is available at airqual.sav.

Inst1 Inst2 Diff Trial
258 275 -17 1
387 359 28 2
258 288 -30 3
179 186 -7 4
447 384 63 5
293 303 -10 6
282 327 -45 7
128 155 -27 8
237 247 -10 9
292 300 -8 10
1 65 -64 11
256 240 16 12
89 147 -58 13
299 343 -44 14
180 205 -25 15
193 217 -24 16
338 355 -17 17
279 291 -12 18
236 276 -40 19
110 160 -50 20
251 266 -15 21
358 329 29 22
410 377 33 23
350 380 -30 24
42 84 -42 25
202 213 -11 26
189 247 -58 27
245 310 -65 28
165 193 -28 29
510 394 116 30
330 374 -44 31
293 327 -34 32
202 214 -12 33
316 335 -19 34
154 178 -24 35
151 168 -17 36
202 228 -26 37
264 296 -32 38
107 145 -38 39
336 365 -29 40
279 299 -20 41
70 130 -60 42
169 144 25 43
228 228 0 44
297 309 -12 45
625 510 215 46
264 338 -74 47
254 260 -6 48
319 345 -26 49
309 340 -31 50

 

What would students expect from such a set of data?

There are at least two models, or underlying processes, behind these data.

  • It could be that the underlying measure is constant, and that we are just taking different measures of the same thing each day, e.g., going out each day and measuring the altitude of Mt. Mansfield.
  • It could be that the underlying measure is variable, and that we are really measuring something different each day, e.g., measuring daily temperature.
  • In the first case, Mt. Mansfield doesn't change its height on a daily basis, so the only source of variability is errors of measurement. In the second case there is variability due to measurement, but also variability due to changing true temperature. Both of these are combined in the data, though we can separate them a bit.
  • In this example the answer may, or may not, be obvious, depending on what we are trying to measure about air quality.

One psychological parallel here would be measuring traits (extroversion) versus states (mood).

In each case, for a single measure we have a True Score (the True Temperature, that God knows) and an error score (the bit of error associated with that measurement). The latter comes from the fact that your thermometer, or whatever, is never perfectly accurate, nor are your eyes.

Y = True Score ± error

or, more commonly, Y = True Score + error (when we realize that error can be negative).

Ask them to think of possible sources of error in things that are not stable over time:

  • just in measuring extroversion.
  • in measuring "happiness"

What are the likely sources of error if the underlying trait being measured is stable?

  • The height of Mt. Mansfield
  • John Smith’s degree of extroversion
  • Mary Smith’s intelligence

One of the most important sources of error is plain old "sampling error." Elaborate.

What would our measurements look like?

Think about the differences between models.

  • In the first model, the only difference between one measurement and another is error variance.
  • In the second model the true score itself changes from one day to another. So we have some sort of true variability.
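The two models can be sketched with a small simulation. Everything numeric here is hypothetical (a true value of 250, an error standard deviation of 20, a day-to-day true-score standard deviation of 80, normally distributed errors); the point is only that the second model adds true variability on top of error variance.

```python
import random
import statistics

random.seed(1)       # fixed seed so the sketch is repeatable

N_DAYS = 50
TRUE_VALUE = 250.0   # hypothetical fixed true score (model 1)
ERROR_SD = 20.0      # hypothetical measurement error
TRUE_SD = 80.0       # hypothetical day-to-day change in the true score (model 2)

# Model 1: Y = constant true score + error.
fixed = [TRUE_VALUE + random.gauss(0, ERROR_SD) for _ in range(N_DAYS)]

# Model 2: the true score itself varies from day to day, plus the same error.
moving = [
    random.gauss(TRUE_VALUE, TRUE_SD) + random.gauss(0, ERROR_SD)
    for _ in range(N_DAYS)
]

sd_fixed = statistics.stdev(fixed)
sd_moving = statistics.stdev(moving)
print(f"fixed true score:  s = {sd_fixed:.1f}")
print(f"moving true score: s = {sd_moving:.1f}")
```

From a single set of observed scores we cannot tell how much of the spread is error and how much is true change; the simulation can only because we built the two pieces in ourselves.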

What would the measures of central tendency and dispersion be?

        (we’ll put this off for a minute or two)

What would the shape be?

This can get us to a nice discussion of normality of errors.

Assume first a stable underlying trait

Assume next a fluctuating underlying trait

I want them to see that there is much more error in the second situation.

I realize that these notes are getting to be a bit redundant as a result of revisions, but the points are still legitimate.

Bring this back to the idea of a model as a way of thinking about what your data should look like. 

Plotting the data

Start out with the SPSS histogram

[Figure wpe1.jpg: SPSS histogram of Instrument 1 with superimposed normal curve]

You can control the width of the bars by editing the graph, but these are fine for what I want.

Comment on the normal curve that I chose to superimpose.

Comment on the apparent outlier(s)

Show Instrument 2

[Figure wpe2.jpg: SPSS histogram of Instrument 2]

Compare the distributions for Instruments 1 and 2

Notice that there are more outliers for Instrument 1 (at least at the high end), and that Instrument 2 is slightly negatively skewed.

We can show boxplots from SPSS.

Ask what this tells us (if anything) about true and error scores?

I don't see that it tells us much.

Ask how, if at all, this depends on the idea of a fixed, versus moving, true score.

Visually we can't distinguish between them, because we don't have any way to separate error from true drift over time.

What can we conclude from what we see here?

Scatterplots

I am not covering scatterplots until Chapter 9, but we certainly should be able to discuss them here.

Show the scatterplot from SPSS.

[Figure wpe4.jpg: SPSS scatterplot of the two instruments]

What does this tell us about how our instruments are measuring?

Notice that the best fitting line doesn’t go through (0, 0) and (500, 500). What does that mean?

Draw that line in and see if it tells us anything. (Here we’re talking about the slope.)

  • The slope is 0.70.
    • Can they tell me what that means?

Can we say anything about the intercept?

  • The intercept is 89.
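A short sketch of what the reported slope and intercept imply. It assumes the line predicts Instrument 2 from Instrument 1 (the notes do not say which instrument is on which axis); the function name is illustrative.

```python
SLOPE = 0.70       # reported slope
INTERCEPT = 89.0   # reported intercept

def predicted_inst2(inst1):
    """Predicted Instrument 2 reading, assuming Inst2 is regressed on Inst1."""
    return INTERCEPT + SLOPE * inst1

# If the instruments agreed perfectly, the line would pass through
# (0, 0) and (500, 500). Compare with what the fitted line gives.
for x in (0, 250, 500):
    print(x, round(predicted_inst2(x), 1))

# The reading at which the fitted line crosses the line of agreement Y = X.
crossover = INTERCEPT / (1 - SLOPE)
print(f"lines cross near {crossover:.0f}")
```

Under that assumption the line predicts 89 at a reading of 0 and 439 at a reading of 500, so the second instrument reads higher at the low end and lower at the high end, with the crossover near 297.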

Time series data

These are trials over days (at least if we think of them as Air Quality measures).

We could plot the data as a function of time, to see if there are trends.

I did this for Instrument 1: nothing.

[Figure wpe8.jpg: Instrument 1 plotted over time]

Nothing with Instrument 2 either.

We could look at the difference between Instrument 1 and Instrument 2 as a function of time.

Do the instruments differ more or less over time?

  • This would happen if one of them were deteriorating in some way.

Show this scatterplot (time series)

[Figure wpe9.jpg: difference between the instruments plotted over time]

Show what happens when you superimpose a line at Y = 0.

[Figure wpe5.jpg: the same plot with a line superimposed at Y = 0]

Ask why I might do this.

Ask what they see.
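One thing the line at Y = 0 makes easy is counting how many differences fall on each side of it. A minimal sketch using the Diff column from the data table (Python is used here only for illustration):

```python
# Diff column (Inst1 - Inst2) for trials 1-50, copied from the data table.
# Trial 46 is listed as 215 even though 625 - 510 = 115; only its sign
# matters for this count, so it is kept as listed.
diffs = [
    -17, 28, -30, -7, 63, -10, -45, -27, -10, -8,
    -64, 16, -58, -44, -25, -24, -17, -12, -40, -50,
    -15, 29, 33, -30, -42, -11, -58, -65, -28, 116,
    -44, -34, -12, -19, -24, -17, -26, -32, -38, -29,
    -20, -60, 25, 0, -12, 215, -74, -6, -26, -31,
]

n_negative = sum(1 for d in diffs if d < 0)   # points below the line
n_positive = sum(1 for d in diffs if d > 0)   # points above the line
n_zero = diffs.count(0)                       # points on the line

print(f"below: {n_negative}, above: {n_positive}, on the line: {n_zero}")
```

If the instruments differed only by random error, the points should fall about equally on both sides of the line. Here 41 of the 50 differences are negative, 8 are positive, and 1 is zero.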

 

Mean and Standard Deviation

For Inst. 1, Mean = 192.66; for Inst. 2, Mean = 206.98

What does this tell us?

Why would we be more (or less) interested in the mean than the median?

efficiency

For Inst. 1, s = 113.95; for Inst. 2, s = 86.57

(or s1 = 113.95, s2 = 86.57)

What does this tell us?

How can we relate the standard deviation to the two histograms?

Internal and External Validity

External validity refers to how well the statistic in question can represent the population.

With random sampling we have external validity (assuming lots of other stuff). Do we have external validity here?

Internal validity refers to the integrity of the study.

We must have random assignment to have a chance at internal validity.

But, to have random assignment, you need two groups, and we don’t have that.

Discuss a study in which we have separate observations in the different groups.

Validity

Observed score = True score + error

Transformations

For today they only need to know:

Adding or subtracting a constant affects the mean but not the standard deviation.

Multiplying or dividing by a constant multiplies the mean and the standard deviation by that constant, and multiplies the variance by the square of the constant.
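These two rules are easy to check numerically. A minimal sketch with a small made-up set of scores (the data and the constants 10 and 3 are hypothetical):

```python
import statistics

scores = [2.0, 4.0, 4.0, 5.0, 7.0, 8.0]   # hypothetical data

def describe(values):
    """Return (mean, standard deviation, variance) of a sample."""
    return (
        statistics.mean(values),
        statistics.stdev(values),
        statistics.variance(values),
    )

mean0, sd0, var0 = describe(scores)

# Adding a constant shifts the mean and leaves the spread alone.
mean_add, sd_add, var_add = describe([x + 10 for x in scores])

# Multiplying by a constant multiplies the mean and the standard deviation
# by that constant, and the variance by its square.
mean_mul, sd_mul, var_mul = describe([x * 3 for x in scores])

print(mean0, sd0, var0)
print(mean_add, sd_add, var_add)
print(mean_mul, sd_mul, var_mul)
```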

To create a new variable with mean = 5 and s.d. = 7 (for example)

Start with a N(0,1) distribution (given by most computer software)

Tell them how to do this with SPSS

Create a new variable as Y = NewStDev*X + NewMean

Here that would be Y = 7X + 5
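A sketch of the same recipe in Python rather than SPSS. The sample size of 1000 and the seed are arbitrary choices. Note that a random N(0,1) sample will not have a mean of exactly 0 or a standard deviation of exactly 1, so Y = 7X + 5 hits the targets only approximately; standardizing the sample first (an extra step not in the notes) makes them exact.

```python
import random
import statistics

random.seed(2001)            # arbitrary seed, for repeatability
NEW_MEAN, NEW_SD = 5.0, 7.0

# Start with draws from N(0, 1).
x = [random.gauss(0, 1) for _ in range(1000)]

# Y = NewStDev * X + NewMean: close to the targets, not exact.
y = [NEW_SD * xi + NEW_MEAN for xi in x]

# Standardize the sample first if the targets must be met exactly.
x_mean, x_sd = statistics.mean(x), statistics.stdev(x)
z = [(xi - x_mean) / x_sd for xi in x]
y_exact = [NEW_SD * zi + NEW_MEAN for zi in z]

print(statistics.mean(y), statistics.stdev(y))
print(statistics.mean(y_exact), statistics.stdev(y_exact))
```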

Note to self:

Sufficiency: A sufficient statistic makes use of all of the information in a sample.

Efficiency: A more efficient estimator does as well with a smaller sample.

Resistance: Degree to which an estimator is resistant to outliers.

Unbiasedness: Degree to which the statistic centers on the population parameter.
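Resistance is the easiest of these to illustrate. A minimal sketch with made-up scores, showing how one outlier moves the mean far more than the median:

```python
import statistics

scores = [10, 12, 13, 15, 16]    # hypothetical data
with_outlier = scores + [200]    # same data plus one extreme score

mean_before = statistics.mean(scores)
median_before = statistics.median(scores)
mean_after = statistics.mean(with_outlier)
median_after = statistics.median(with_outlier)

print(f"mean:   {mean_before:.2f} -> {mean_after:.2f}")
print(f"median: {median_before} -> {median_after}")
```

The mean jumps from 13.2 to about 44.3, while the median moves only from 13 to 14.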

Last revised: 09/04/01