The major assumption behind traditional parametric procedures--more fundamental than normality and homogeneity of variance--is the assumption that we have randomly sampled from some population (usually a normal one). Of course, virtually no study you are likely to run will employ true random sampling, but leave that aside for the moment. To see why this assumption is so critical, consider an example in which we draw two samples, calculate the sample means and variances, and use those as estimates of the corresponding population parameters. For example, we might draw a random sample of anorexic girls (potentially) given one treatment, and a random sample of anorexic girls given another treatment, and use our statistical test to draw inferences about the parameters of the corresponding populations from which the girls were randomly sampled. We would probably like to show that our favorite treatment leads to greater weight gain than the competing treatment, and thus that the mean of the population of all girls given our favorite treatment is greater than the mean of the other population. But statistically, it makes no sense to say that the sample means are estimates of the corresponding population parameters unless the samples are drawn randomly from that (those) population(s). (Using the 12 middle school girls in your third period living-arts class is not going to give you a believable estimate of U.S. (let alone world) weights of pre-adolescent girls.) That is why the assumption of random sampling is so critical. In the extreme, if we don't sample randomly, we can't say anything meaningful about the parameters, so why bother? That is part of the argument put forth by the resampling camp.
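A small simulation may make the classroom example concrete. This is only a sketch under assumptions of my own: the population of 50 classrooms of 12 girls, the weights in kilograms, the spread within and between classrooms, and the helper name `average_miss` are all invented for illustration and come from no real data. It compares how far a sample mean typically lands from the population mean when we draw 12 girls at random and when we simply take one intact classroom.

```python
import random
from statistics import mean

rng = random.Random(42)  # fixed seed so the sketch is repeatable

# Hypothetical population: 50 classrooms of 12 girls each. Every classroom has
# its own typical weight, so girls in the same room resemble one another.
classrooms = []
for _ in range(50):
    room_center = rng.gauss(45.0, 5.0)  # invented classroom mean (kg)
    classrooms.append([rng.gauss(room_center, 4.0) for _ in range(12)])

population = [weight for room in classrooms for weight in room]
mu = mean(population)  # the parameter we would like to estimate


def average_miss(draw_sample, repeats=2000):
    """Mean absolute distance between a sample mean and the population mean."""
    return mean(abs(mean(draw_sample()) - mu) for _ in range(repeats))


# Random sampling: 12 girls drawn at random from the whole population.
random_miss = average_miss(lambda: rng.sample(population, 12))

# Convenience sampling: whichever single classroom happens to be at hand.
convenience_miss = average_miss(lambda: rng.choice(classrooms))

print(f"typical miss, random sample of 12: {random_miss:.2f} kg")
print(f"typical miss, one intact classroom: {convenience_miss:.2f} kg")
```

Under these made-up conditions the intact classroom misses the population mean by considerably more, on average, than a random sample of the same size. How much more depends entirely on how strongly the girls within a room resemble one another, which is an assumption I built in.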

Of course, those of us who have been involved in statistics for any length of time recognize this assumption, but we rarely give it much thought. We assume that our sample, though not really random, is a pretty good example of what we would have if we had the resources to draw truly random samples, and we go merrily on our way, confident in the belief that the samples we actually have are "good enough" for the purpose. That is where the parametric folks and the resampling folks have a parting of the ways.

The parametric people are not necessarily wrong in thinking that on occasion nonrandom sampling is good enough. If we are measuring something that would not be expected to vary *systematically* among participants, such as the effect of specific stimulus variations on visual illusions, then a convenience sample may give acceptable results. But keep in mind that any inferences we draw are not statistical inferences, but logical inferences. Without random sampling we cannot make a statistical inference about the mean of a larger population. But on nonstatistical grounds it may make good sense to assume that we have learned something about how people in general process visual information. But using that kind of argument to brush aside some of the criticisms of parametric tests doesn't diminish the fact that the resampling approach legitimately differs in its underlying philosophy.

The resampling approach, and for now I mean the randomization test approach, and not bootstrapping, really looks at the problem differently. In the first place, people in that area don't give a "population" the centrality that we are used to assigning to it in parametric statistics. They don't speak as if they sit around fondly imagining those lovely bell-shaped distributions with numbers streaming out of them, that we often see in introductory textbooks. In fact, they hardly appear to think about populations at all. And they certainly don't think about drawing random samples from those imaginary populations. Those people are as qualified as you could wish as statisticians, but they don't worry too much about estimating parameters, for which you really do need random samples. They just want to know the likelihood of the sample data falling as they did if treatments were equally effective. And for that, they don't absolutely need to think of populations.

In the history of statistics, the procedures with which we are most familiar were developed on the assumption of random sampling. And they were developed with the expectation that we are trying to estimate the corresponding population mean, variance, or whatever. This idea of "estimation" is central to the whole history of traditional statistics--we estimate population means so that we can (hopefully) conclude that they are different and that the treatments have different effects.

But that is not what the randomization test folks are trying to do. They start with the assumption that samples are probably not drawn randomly, and assume that we have no valid basis (or need) for estimating population parameters. This, I think, is the best reason to think of these procedures as *nonparametric* procedures, though there are other reasons to call them that. But if we can't estimate population parameters, and thus have no legitimate basis for retaining or rejecting a null hypothesis about those parameters, what basis do we have for constructing any statistical test? It turns out that we have legitimate alternative ways for testing our hypothesis, though I'm not sure that we should even be calling it a null hypothesis.

This difference over the role of random sampling is a critical difference between the two approaches. But that is not all. The resampling people, in particular, care greatly about *random assignment*. The whole approach is based on the idea of random assignment of cases to conditions. That will appear to create problems later on, but take it as part of the underlying rationale. Both groups certainly think that random assignment to conditions is important, primarily because it rules out alternative explanations for any differences that are found. But the resampling camp goes further, and makes it the center point of their analysis. To put it very succinctly, a randomization test works on the logical principle that if cases were randomly assigned to treatments, and if treatments have absolutely no effect on scores, then a particular score is just as likely to have appeared under one condition as under any other. Notice that the principle of random assignment tells us that if the null hypothesis is true, we could validly shuffle the data and expect to get essentially the same results. This is why random assignment is fundamental to the statistical procedure employed.
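The shuffling logic is easy to sketch in code. What follows is a minimal illustration in Python, not a definitive implementation. The weight-gain scores are invented, and the function name `randomization_test`, the choice of the difference in means as the test statistic, the 10,000 shuffles, and the fixed seed are all my assumptions. Note also that sampling shuffles at random, rather than enumerating every possible reassignment, gives only an approximate probability.

```python
import random
from statistics import mean


def randomization_test(group_a, group_b, n_shuffles=10_000, seed=0):
    """Approximate two-sided randomization test on the difference in means.

    Returns the observed difference and the proportion of shuffled
    assignments whose difference is at least as extreme as the observed one.
    """
    rng = random.Random(seed)
    observed = mean(group_a) - mean(group_b)

    # If treatments have no effect, every score could equally well have
    # landed in either condition, so we pool the scores and reassign them.
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)

    as_extreme = 0
    for _ in range(n_shuffles):
        rng.shuffle(pooled)
        diff = mean(pooled[:n_a]) - mean(pooled[n_a:])
        if abs(diff) >= abs(observed) - 1e-12:  # small tolerance for rounding
            as_extreme += 1

    # Counting the observed arrangement itself (+1) is a common convention
    # that keeps the estimated probability above zero.
    p_value = (as_extreme + 1) / (n_shuffles + 1)
    return observed, p_value


# Invented weight gains for two hypothetical treatments.
treatment_a = [3.1, 4.5, 2.2, 5.0, 3.8, 4.1]
treatment_b = [1.0, 2.4, 0.5, 1.9, 2.8, 1.2]

observed, p_value = randomization_test(treatment_a, treatment_b)
print(f"observed difference in means: {observed:.2f}")
print(f"approximate two-sided p: {p_value:.4f}")
```

Nothing in this sketch refers to a population or estimates a parameter. The only question it asks is how often random reassignment alone would produce a difference as large as the one we obtained, and with these invented scores the answer comes out well below .05.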


David C. Howell

University of Vermont

David.Howell@uvm.edu