
Weight, Shape, and Body Image

David C. Howell

Based on a paper by Geller, Johnston, and Madsen, 1997.

Eating disorders have become an important research problem for psychologists in the last 20 years, and a variety of theories abound. Recently, people have begun looking at eating disorders in the context of an individual's perception of themselves. Models such as these argue that a person's concerns about weight and shape play a central role in feelings of self-worth, which in turn play a role in eating disturbances.

Geller, Johnston, and Madsen (1997) developed an inventory called the Shape and Weight Based Self-esteem Inventory (SAWBS), and examined its psychometric properties. I do not have room here to go over all of their work, but the paper is a nice example of what needs to be done in the development of any new scale.

Geller et al. maintained that the SAWBS has an independent role in predicting eating-disordered behavior over and above the role of more traditional variables such as depression and general self-esteem. To investigate this hypothesis, they collected data on 84 female subjects on a number of different variables associated with eating disorders. These measures are outlined below.

The intent of this investigation was to examine the psychometric properties of the SAWBS. For our purposes the major question of interest will be the degree to which this measure can provide additional predictability of eating disordered behaviors, after we control for more traditional variables such as body mass, depression, and self-esteem. I have chosen to use this example precisely because it looks at the additional contribution of a variable. The general name for this approach is hierarchical regression, which, as you will quickly discover, can be thought of as just plain old linear regression with a difference. Hierarchical regression is not a new approach to regression, nor is it a separate technique.

The data are available in the file named Geller.sav. (For those reading this from a web page, I have made them available in ASCII form at Geller.dat. The variable names are SAWBS, WtPercep, ShPercep, Hiq, EDIComp, RSES, BDI, SES, and SocDesir.) They closely mimic the data collected by Geller et al., both in terms of descriptive statistics and the intercorrelation matrix. The variables were generated with a normal random number generator, and were not truncated to integers, so fractional and negative values are included. This does not in any way alter the interpretation.

Relationships Among Variables

The intercorrelation matrix for these variables is given in Table 1, and has been pasted from SPSS printout. All correlations are based on a sample of N = 84.

SAWBS and Physical Characteristics

There are several correlations that are of interest here. In the first place, recall that the SAWBS deals with the extent to which self-worth is based on shape and weight, but is not a measure of actual shape and weight. As such, it should not be correlated with the Body Mass Index (BMI). In fact, that correlation is only .17, which is not significant.

SAWBS and Perceptions

It is apparent that the role that shape and weight play in determining an individual's feeling of self-esteem is, in fact, related to that person's feelings of satisfaction with their weight and shape, as can be seen by the significant correlations with WtPercep and ShPercep. However, it is important to keep in mind that the WtPercep measure is not linear in terms of desirability. WtPercep runs from 1 = extremely overweight, to 4 = neutral, to 7 = extremely underweight. Presumably the optimal level would be a 4, with satisfaction falling off toward the extremes. This would have led me to expect a curvilinear relationship between that variable and others, and yet the relationship turns out to be linear. The explanation for the linearity is apparent when you draw a scatterplot of this relationship. Here you see that the highest value of WtPercep is approximately 4.75, meaning that there are no data at the upper (underweight) end of the scale. That, in itself, is interesting. (The scatterplot (Figure 1) follows the intercorrelation matrix.)

Table 1

- -  Correlation Coefficients  - -

               SAWBS  WTPERCEP  SHPERCEP       HIQ   EDICOMP      RSES

SAWBS         1.0000    -.3930    -.3855     .6149     .6148    -.3783
                P= .   P= .000   P= .000   P= .000   P= .000   P= .000

WTPERCEP      -.3930    1.0000     .5592    -.5647    -.6059     .3841
             P= .000      P= .   P= .000   P= .000   P= .000   P= .000

SHPERCEP      -.3855     .5592    1.0000    -.5545    -.6666     .4173
             P= .000   P= .000      P= .   P= .000   P= .000   P= .000

HIQ            .6149    -.5647    -.5545    1.0000     .8573    -.6092
             P= .000   P= .000   P= .000      P= .   P= .000   P= .000

EDICOMP        .6148    -.6059    -.6666     .8573    1.0000    -.6812
             P= .000   P= .000   P= .000   P= .000      P= .   P= .000

RSES          -.3783     .3841     .4173    -.6092    -.6812    1.0000
             P= .000   P= .000   P= .000   P= .000   P= .000      P= .

BDI            .4237    -.4513    -.4732     .6561     .7112    -.7945
             P= .000   P= .000   P= .000   P= .000   P= .000   P= .000

BMI            .1656    -.6135    -.3238     .2188     .2373    -.0599
             P= .132   P= .000   P= .003   P= .046   P= .030   P= .588

SES           -.1338     .0040     .1444    -.1788    -.1794     .1762
             P= .225   P= .971   P= .190   P= .104   P= .102   P= .109

SOCDESIR      -.1259     .2294    -.0676    -.3564    -.1664     .2607
             P= .254   P= .036   P= .541   P= .001   P= .130   P= .017

Figure 1 Scatterplot of SAWBS as a function of WtPercep.

Relationship between SAWBS and Eating Disorder Measures

If the SAWBS measures the degree to which shape and weight play a role in determining the individual's feelings of self-esteem, then we would also expect that variable to be related to symptoms of eating disorders. That would suggest that the SAWBS measure should be correlated with both the Eating Disorders Inventory (EDI) and the Health Information Questionnaire (HIQ). From Table 1 we can see that these correlations are both .61, and are significant at p < .001 (SPSS prints such probabilities as .000). There is no way that we can establish a causal ordering on these relationships, but it is apparent that women who show many symptoms of eating disorders also report that their shape and weight play an important role in determining their feelings of self-esteem.

SAWBS and Self-esteem

Another interesting relationship is that between the SAWBS and the Rosenberg Self-Esteem Scale (RSES). The fact that someone's self-esteem is influenced by their satisfaction with their shape and weight does not in itself suggest that there needs to be a correlation between those two variables. Remember that the SAWBS is not a measure of satisfaction, but a measure of influence. However, from Table 1 we see that this correlation is significant and negative (r = -.38).

Multiple Regression Analyses

Geller et al. (1997) hypothesized that the SAWBS would be related to eating disorder measures over and above other more traditional measures, such as the BMI, depression, and actual self-esteem. There are at least two equivalent ways that we could look at this question, but the approach we will take here is often referred to as "hierarchical regression."

Hierarchical regression has become an "in phrase" in the past few years, but all that it really means is "Do these independent variables add anything new when they are added to a mix of other independent variables?" We do this exactly the way it sounds: we first use the "controlled for" variables to predict the dependent variable, and then we add the extra variables to the model and see if prediction improves reliably.
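The two-step logic can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the data are synthetic stand-ins generated on the spot (not the Geller data), the names control1, control2, and added are hypothetical, and the fit is a bare-bones normal-equations solver rather than whatever SPSS does internally.

```python
import random

def solve(a, b):
    """Solve the linear system a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [x - factor * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]

def r_squared(predictors, y):
    """R^2 from a least squares fit of y on the predictors plus an intercept."""
    rows = [[1.0] + [p[i] for p in predictors] for i in range(len(y))]
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    b = solve(xtx, xty)
    mean_y = sum(y) / len(y)
    ss_res = sum((yi - sum(bj * xj for bj, xj in zip(b, r))) ** 2
                 for r, yi in zip(rows, y))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return 1.0 - ss_res / ss_tot

# Synthetic stand-in data (NOT the Geller data): two "controlled for"
# predictors and one predictor whose added contribution we want to assess.
random.seed(1)
n = 84
control1 = [random.gauss(0, 1) for _ in range(n)]
control2 = [random.gauss(0, 1) for _ in range(n)]
added = [random.gauss(0, 1) for _ in range(n)]
y = [c1 - 0.5 * c2 + 0.8 * a + random.gauss(0, 1)
     for c1, c2, a in zip(control1, control2, added)]

r2_reduced = r_squared([control1, control2], y)       # step 1: controls only
r2_full = r_squared([control1, control2, added], y)   # step 2: controls plus the new variable
increment = r2_full - r2_reduced                      # what the new variable adds
print(round(r2_reduced, 3), round(r2_full, 3), round(increment, 3))
```

The quantity of interest is the increment, not either R squared on its own. Whether the increment is reliable is a separate question that needs a significance test.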

We will start with the EDI composite as our index of eating disorder symptoms. (It is our dependent variable, and we would get similar results if we used HIQ instead.) First we will predict EDIcomp from BMI, BDI, and RSES, which are traditional predictors of eating disorders. The results of this regression are shown below. The printout is from a previous version of SPSS, but I have used it because it saves you having to download images. The results are exactly the same.

Table 2-The Reduced Model

* * * *   M U L T I P L E   R E G R E S S I O N   * * * *

Equation Number 1    Dependent Variable..  EDICOMP

Variable(s) Entered on Step Number

   1..    RSES

   2..    BMI

   3..    BDI

Multiple R           .75812

R Square             .57474

Adjusted R Square    .55879

Standard Error     11.62412

Analysis of Variance

                    DF      Sum of Squares      Mean Square

Regression           3         14609.13054      4869.71018

Residual            80         10809.61946       135.12024

F =      36.03983       Signif F = .0000

------------------ Variables in the Equation ------------------

Variable              B        SE B      Beta         T  Sig T

BMI            1.019367     .413119   .180574     2.467  .0157

BDI            1.075868     .292390   .442643     3.680  .0004

RSES           -.680090     .256260  -.318671    -2.654  .0096

(Constant)    13.566366   14.694770                .923  .3587

There are several interesting things to see in this printout, but I will skip over most of them for the time being. One of the two most important things to notice is that these three variables taken together have a multiple correlation with EDIcomp of .76, and account for 57 percent of the variability in that measure. That is a substantial percentage of the variability for measures that are not directly measuring feelings of satisfaction with body image.

The second thing to note is that all three variables make a significant contribution to the prediction. That is interesting in light of the fact that BMI on its own is only weakly correlated with EDIcomp (r = .24, p = .03). I would have expected that this variable would have dropped to non-significance in the presence of two much more powerful predictors.
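The summary figures in Table 2 hang together through standard formulas, and a short Python sketch can confirm that. It assumes the usual textbook definitions of adjusted R squared, the overall F, and the standard error of estimate. The variable names are illustrative, and only the numbers copied from the printout come from the source.

```python
import math

n = 84                      # sample size
p = 3                       # predictors in the reduced model (BMI, BDI, RSES)
multiple_r = 0.75812        # Multiple R from Table 2
ss_regression = 14609.13054 # regression sum of squares from Table 2
ss_residual = 10809.61946   # residual sum of squares from Table 2

r2 = multiple_r ** 2                            # proportion of variation accounted for
adj_r2 = 1 - (1 - r2) * (n - 1) / (n - p - 1)   # adjusted for the number of predictors
ms_regression = ss_regression / p               # mean square on p df
ms_residual = ss_residual / (n - p - 1)         # mean square on n - p - 1 df
f = ms_regression / ms_residual                 # overall F for the model
std_error = math.sqrt(ms_residual)              # standard error of estimate

print(round(r2, 4), round(adj_r2, 4), round(f, 2), round(std_error, 3))
```

Each computed value should agree with the corresponding line of the printout to within rounding.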

The next step in the hierarchical regression is to add SAWBS to the predictor list and see if it contributes anything additional to EDIcomp. Notice that we are not asking if SAWBS is related to EDIcomp. We are asking if it is related to EDIcomp when we control for BMI, BDI, and RSES. Another way of saying exactly the same thing is to ask if it adds significantly to the accounted for variation over and above what was accounted for by those three variables. (This is what we mean when we speak of "hierarchical" regression--there is a hierarchy of predictors.) When we run this larger regression we obtain:

Table 3-The Full Model 

Equation Number 1    Dependent Variable..  EDICOMP

Variable(s) Entered on Step Number

   1..    SAWBS

   2..    BMI

   3..    RSES

   4..    BDI

Multiple R           .81895

R Square             .67067

Adjusted R Square    .65400

Standard Error     10.29382

Analysis of Variance

                    DF      Sum of Squares      Mean Square


Regression           4         17047.69469      4261.92367

Residual            79          8371.05531       105.96273

F =      40.22097       Signif F = .0000

------------------------- Variables in the Equation--------------

Variable           B       SE B     Beta  Tolerance      VIF    T       

Sig T 

BDI          .805717    .264981   .331495  .350737     2.851    3.041   


BMI          .762357    .369742   .135046  .971748     1.029    2.062   


RSES        -.594573    .227632  -.278600  .366422     2.729   -2.612   


SAWBS        .130437    .027190   .346591  .798630     1.252    4.797   


(Constant)  1.388258  13.020970                                  .875   


I obtained this slightly expanded printout by choosing Collinearity Diagnostics from the Statistics button in the regression dialog box. Notice the value of R² = .67, which is an increase of .10 from the R² in the previous model.

We will refer to the two models that we have here as the full and reduced models. The full model has all of the predictors, while the reduced model has only some of the predictors from the full model. (The reduced model cannot have any predictors that are not included in the full model; it must be a proper subset of the full model.)

The added contribution of SAWBS, controlling for BMI, BDI, and RSES is .10. We can test the significance of this increment by a standard formula presented in the text. (There is an easier way in this particular case, and I'll come back to that in a minute.)

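$$F(f - r,\ N - f - 1) = \frac{(N - f - 1)(R^2_f - R^2_r)}{(f - r)(1 - R^2_f)} = \frac{(84 - 4 - 1)(.67 - .57)}{(4 - 3)(1 - .67)} = \frac{79(.10)}{.33} = 23.94$$

where f is the number of predictors in the full model, r is the number of predictors in the reduced model, N is the sample size, $R^2_f$ is the squared multiple correlation from the full model, and $R^2_r$ is the squared multiple correlation from the reduced model. With an F of nearly 24 on 1 and 79 degrees of freedom, this increment (the increase in predictability due to adding SAWBS) is clearly significant.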

It is apparent from these results that SAWBS has a significant and important contribution to make over and above the contributions of the other variables. We refer to this as the unique contribution of SAWBS.

Because our full model contains only one additional variable, we actually knew whether its additional contribution was significant without calculating the F statistic above. The t test for the significance of the slope of SAWBS is exactly the same test, because it tests whether the slope for SAWBS is significant in the presence of the other variables. (If I had not rounded so severely in the calculation of F, the square of t would be exactly equal to the F that I calculated.) This scheme only works when I add one predictor. If I had added two predictors, the F would be a test on whether those two predictors contributed significant new accountable variation, whereas the t tests on their slopes would deal with the variables individually.
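The increment test and its link to the t test can be checked numerically. The sketch below assumes the usual increment formula, F = (N - f - 1)(R²f - R²r) / ((f - r)(1 - R²f)) on f - r and N - f - 1 degrees of freedom. The function name incremental_f is hypothetical. The R² values and the t for SAWBS are copied from Tables 2 and 3.

```python
def incremental_f(r2_full, r2_reduced, n, f, r):
    """F on (f - r, n - f - 1) df for the gain in R^2 from the reduced to the full model."""
    return ((n - f - 1) * (r2_full - r2_reduced)) / ((f - r) * (1 - r2_full))

n, f, r = 84, 4, 3   # sample size, predictors in the full model, predictors in the reduced model

f_rounded = incremental_f(0.67, 0.57, n, f, r)        # R^2 values rounded to two decimals
f_precise = incremental_f(0.67067, 0.57474, n, f, r)  # R^2 values as printed in Tables 2 and 3
t_sawbs = 4.797                                       # t for SAWBS in Table 3

print(round(f_rounded, 2), round(f_precise, 2), round(t_sawbs ** 2, 2))
```

With the two-decimal R² values the F comes out near 24. With the five-decimal values it drops to about 23, which matches the square of the SAWBS t to within rounding.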

These results tell us that even when we control for body mass, depression, and self-esteem, the role that shape and weight play in forming an individual's self-esteem is an important predictor of eating disorders. This is potentially useful information.

Semi-partial Correlation

We know that SAWBS accounts for additional variation, but how much additional variation? Well, we know that also. For the reduced model, $R^2_r$ = .57, and for the full model $R^2_f$ = .67. This increase of .10 in the squared multiple correlation is the amount that SAWBS accounts for over and above the other variables. This is known as the squared semi-partial correlation for SAWBS. All squared semi-partial correlations are of this type: they are the increase in R² when we add in just that variable. In the book I show how you can calculate these for each variable in the model from the individual t tests on the slope. In other words, you could look at the increment in R² when each variable is added after the others, or you could just look at the t values for the full model and make the calculation directly. One important thing to note is that if you order the variables in the full model by the magnitude of their t statistics, you are essentially ordering them on the basis of the size of their squared semi-partial correlations, given the other variables in the set.
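The calculation from the t values can be sketched as follows. It assumes the relation sr² = t²(1 - R²f) / (N - f - 1), which is the increment F formula solved for the increment when a single predictor is added. The function name squared_semipartial is hypothetical, and the t values are those printed in Table 3.

```python
n, f = 84, 4          # sample size and number of predictors in the full model
r2_full = 0.67067     # R Square from Table 3
t_values = {"BDI": 3.041, "BMI": 2.062, "RSES": -2.612, "SAWBS": 4.797}  # from Table 3

def squared_semipartial(t, r2_full, n, f):
    """Increase in R^2 attributable to one predictor entered after all the others."""
    return t ** 2 * (1 - r2_full) / (n - f - 1)

sr2 = {name: squared_semipartial(t, r2_full, n, f) for name, t in t_values.items()}

# Listing by size of sr^2 gives the same order as listing by |t|.
for name in sorted(sr2, key=sr2.get, reverse=True):
    print(name, round(sr2[name], 3))
```

For SAWBS the result should reproduce the difference between the two R² values, .67067 - .57474, to within rounding.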


Earlier I noted that it was interesting that BMI was a significant predictor in the reduced model (and the full model) even though it had a relatively low correlation with EDIcomp when looked at alone. Usually, variables that are only weakly (even if significantly) correlated with a dependent variable tend to drop out in more complex models because their role is taken over by the other variables in the model. So why didn't this happen to BMI? At least a partial answer can be found in the column headed Tolerance in the full model. The tolerance of a variable is a measure of how that variable correlates with the other independent variables in the model. The more highly it is correlated, the lower its tolerance will be (and the more able the other variables are to carry its weight). Actually,

$$\text{tolerance} = 1 - R^2_{\text{predictor} \cdot \text{other predictors}}$$

If you were to calculate the multiple correlation of BMI with BDI, RSES, and SAWBS, you would find $R^2_{\text{BMI} \cdot \text{BDI, RSES, SAWBS}} = .02825$. If we subtract this from 1.00 we get .97175, which is the tolerance given in Table 3.
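The arithmetic is simple enough to check directly. The sketch below also assumes the standard relation VIF = 1 / tolerance, which the text does not state but which the VIF column in Table 3 bears out. The function names are illustrative.

```python
def tolerance(r2_with_other_predictors):
    """Share of a predictor's variance NOT explained by the other predictors."""
    return 1.0 - r2_with_other_predictors

def vif(tol):
    """Variance inflation factor, taken here as the reciprocal of tolerance."""
    return 1.0 / tol

tol_bmi = tolerance(0.02825)   # R^2 of BMI on BDI, RSES, and SAWBS
print(round(tol_bmi, 5), round(vif(tol_bmi), 3))

# Tolerances copied from Table 3; their reciprocals should match the VIF column.
table3_tolerance = {"BDI": 0.350737, "BMI": 0.971748, "RSES": 0.366422, "SAWBS": 0.798630}
table3_vif = {name: vif(tol) for name, tol in table3_tolerance.items()}
for name, value in table3_vif.items():
    print(name, round(value, 3))
```

BDI and RSES have the lowest tolerances, which fits their strong correlation with each other in Table 1 (r = -.79).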

Because BMI is nearly independent of the other variables in the model, they do not carry any of its information, and therefore they cannot take its place. Hence there is still a role for BMI to play in the regression, which is why it is not eliminated from the model. I must admit that this is the most extreme example of this situation that I have ever seen in real data.

The logical question for people to ask next would be "How small does the tolerance have to be before a variable really doesn't have anything new to add?" The answer is probably found by looking at the column of t. If a variable is significant, it is adding important information. However, in practice, when the tolerance of a variable falls below about .10 (it shares 90% of its variation with the other independent variables), it is unlikely to remain significant in a model. Moreover, tolerances below .10 can make your model very unstable, leading to noticeably different regression equations from sample to sample. I would suggest that when you have variables with a tolerance of .10 or below, you should drop either that variable or others with which it correlates highly.

Interaction Models-Moderation Effects

One of the more interesting problems in multiple regression (as well as one of the most frustrating) is the treatment of interactions. I say "frustrating" because it is often very difficult to find significant interactions, even when you have good reason to expect them. I say "interesting" because the use of interaction terms gives us considerable flexibility and explanatory power in modeling behavior.

We will start with the situation in which we want to predict eating disorders (as measured this time by HIQ) on the basis of self-esteem (RSES) and Shape and Weight based self-esteem (SAWBS). Remember that SAWBS is not a measure of self-esteem; it is just a measure of how important shape and weight are in controlling one's sense of self-esteem.

It seems logical (at least to me), that if someone already has a pretty good sense of self-esteem, the SAWBS measure might be largely irrelevant when it comes to predicting eating disorders. On the other hand, someone with a very poor self-esteem might show eating disorders if shape and weight are important to them, and show some other disorder if those variables are not important. If you think of this from an analysis of variance perspective, it sounds like I'm talking about an interaction. I predict a simple effect of SAWBS under one level of RSES, and no effect of SAWBS under another level of RSES. That's in fact exactly what I'm doing, but how do we get at an interaction in linear regression?

Remember that in Anova we wrote our interaction terms as AXB. That "X" in the middle wasn't just there to fill up some space; it actually represents a multiplicative effect. The same is true in linear regression; our interaction term will be the product of our two main effects. We will create a new variable which is the product of the two independent variables under consideration (SAWBS and RSES), and then use all three variables as predictors. If the interaction term is significant, we have our interaction.

Why can't it ever be simple?

Unfortunately, the world is not quite as simple as that last paragraph implies. If we do what I called for there, you should be able to see the problem. First, I will use SPSS (or whatever program I have at hand) to create a variable that is the product of SAWBS and RSES. Call it SAWXRSES. Then run the regression of HIQ on SAWBS, RSES, and SAWXRSES, in the process asking for the intercorrelation matrix. The result is shown below:

Table 4 -Interactions

Variable(s) Entered on Step Number

   1..    SAWXRSES

   2..    RSES

   3..    SAWBS

Multiple R           .73810

R Square             .54479

Adjusted R Square    .52772

Standard Error      6.25376

Analysis of Variance

                    DF      Sum of Squares      Mean Square

Regression           3          3744.47265      1248.15755

Residual            80          3128.75735        39.10947

F =      31.91446       Signif F = .0000

------------------ Variables in the Equation ------------------

Variable              B        SE B      Beta         T  Sig T

RSES           -.536468     .139465  -.483411    -3.847  .0002

SAWBS           .062231     .057902   .317992     1.075  .2857

SAWXRSES    7.34027E-04     .001598   .125947      .459  .6472

(Constant)    31.110904    5.451364               5.707  .0000

                      - -  Correlation Coefficients  - -

             HIQ        SAWBS     RSES      SAWXRSES

HIQ          1.0000      .6149    -.6092      .4349

            P= .       P= .000    P=.000    P= .000

SAWBS         .6149     1.0000    -.3783      .9055

            P= .000    P= .       P=.000    P= .000

RSES         -.6092     -.3783    1.0000     -.0435

            P= .000    P= .000    P=.       P= .695

SAWXRSES      .4349      .9055    -.0435     1.0000

            P= .000    P= .000    P=.695    P= .

First of all, notice that the multiple correlation is .73810. Write that down on the back of your hand so that you don't forget it. You'll want it in a minute. Notice also that the only variable that is significant is RSES. SAWBS isn't even close to significant, even though it does have a significant first-order correlation with HIQ. Finally, note that the interaction term isn't significant either, but that its regression coefficient is 7.34027E-04. (To read this value, the E-04 says to move the decimal point four places to the left, giving .000734.) Well, that's a bummer! We have much less than we expected, as well as no interaction. What went wrong?

If you look at the intercorrelation matrix which follows the regression, you will see that the interaction term is very highly correlated with the SAWBS main effect. And you should remember that when you have highly correlated independent variables, the solution is both unsatisfactory and unstable. So we have to find a way to get rid of that high correlation.

One simple way to break up the high correlation is to use what are called "centered variables." A centered variable is created by subtracting the variable's mean from every observation. (Centered variables are really just deviation scores, to use a different language.) If you center both SAWBS and RSES, to create CENSAWBS and CENRSES, you will not change the simple correlation of either of those variables with each other or with the dependent variable. Centering is just a linear transformation, such as changing feet to inches, or Fahrenheit to Celsius. You will, however, drastically reduce the correlation between either of those variables and their product (CENSAWRS). If you create those variables and run the regression, you get:

Table 5-Regression with centered variables


                HIQ    CENRSES   CENSAWBS   CENSAWRS

HIQ          1.0000     -.6092     .6149     -.1414

CENRSES      -.6092     1.0000    -.3783      .2102

CENSAWBS      .6149     -.3783    1.0000     -.1842

CENSAWRS     -.1414      .2102    -.1842     1.0000

                 *** MULTIPLE REGRESSION***

Variable(s) Entered on Step Number

   1..    CENSAWRS

   2..    CENSAWBS

   3..    CENRSES

Multiple R           .73810

R Square             .54479

Adjusted R Square    .52772

Standard Error      6.25376

Analysis of Variance

                    DF      Sum of Squares      Mean Square

Regression           3          3744.47265      1248.15755

Residual            80          3128.75735        39.10947

F =      31.91446       Signif F = .0000

------------------ Variables in the Equation ------------------

Variable              B        SE B      Beta         T  Sig T

CENSAWBS        .088656     .016055   .453021     5.522  .0000

CENRSES        -.494188     .091531  -.445313    -5.399  .0000

CENSAWRS    7.34027E-04     .001598   .035687      .459  .6472

(Constant)    16.904613     .719333              23.500  .0000

Notice that the two main effects still have the same correlation with the criterion (HIQ), but that the correlation between the product term and the dependent variable has been changed. This is to be expected. The most important thing to be seen here is that the interaction is no longer very highly correlated with SAWBS. That means we have a much better chance of getting at the independent contributions of these variables. (In other language, we can say that its Tolerance is high.)
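The effect of centering on these correlations can be seen with made-up numbers. The sketch below uses synthetic, independent scores with positive means (hypothetical stand-ins for two scale scores, not the Geller data) and a larger sample than the 84 cases here so that the pattern is stable. The helper pearson_r is written out by hand to keep the example self-contained.

```python
import random

def mean(xs):
    return sum(xs) / len(xs)

def pearson_r(xs, ys):
    """Pearson correlation between two equal-length lists."""
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5

random.seed(2)
n = 500
# Synthetic stand-ins (NOT the Geller data): two independent, positive-valued scores.
x = [random.gauss(50, 10) for _ in range(n)]
z = [random.gauss(30, 5) for _ in range(n)]

mx, mz = mean(x), mean(z)
cx = [a - mx for a in x]                              # centered x
cz = [b - mz for b in z]                              # centered z
raw_product = [a * b for a, b in zip(x, z)]           # product of raw scores
centered_product = [a * b for a, b in zip(cx, cz)]    # product of centered scores

r_raw = pearson_r(x, raw_product)            # main effect vs. raw product
r_centered = pearson_r(cx, centered_product) # main effect vs. centered product
r_xz_raw = pearson_r(x, z)                   # correlation between the two main effects
r_xz_centered = pearson_r(cx, cz)            # same correlation after centering

print(round(r_raw, 3), round(r_centered, 3))
print(round(r_xz_raw, 3), round(r_xz_centered, 3))
```

The raw product is strongly correlated with its component simply because both scores sit far from zero. Subtracting the means removes most of that overlap while leaving the correlation between the two main effects untouched.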

In the multiple regression in Table 5, notice that the overall multiple correlation coefficient is .73810. If that number doesn't seem familiar, look at the back of your hand. We have not explained any more or less of the variation, we have just shoved the pieces around.

Also notice the regression coefficient for the interaction effect (CENSAWRS = .000734027). Doesn't that look familiar? Again, we haven't disrupted the test of the relationship between the criterion and the interaction variable, we have just diddled with the relationship between the product term and the two main effects.
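Why the multiple correlation and the product coefficient survive centering can be shown on synthetic data. The centered product equals the raw product minus multiples of the two main effects plus a constant, so both models contain the same information and only the lower-order coefficients move. The sketch below is an illustration under stated assumptions: the data are made up (not the Geller data), and the names and the small least squares routine are hypothetical conveniences.

```python
import random

def solve(a, b):
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [u - factor * v for u, v in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]

def ols(predictors, y):
    """Least squares fit with an intercept; returns (coefficients, R^2)."""
    rows = [[1.0] + [p[i] for p in predictors] for i in range(len(y))]
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    coefs = solve(xtx, xty)
    mean_y = sum(y) / len(y)
    ss_res = sum((yi - sum(c * v for c, v in zip(coefs, r))) ** 2
                 for r, yi in zip(rows, y))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return coefs, 1.0 - ss_res / ss_tot

# Synthetic stand-in data (NOT the Geller data), generated with no true interaction.
random.seed(3)
n = 84
x = [random.gauss(10, 2) for _ in range(n)]
z = [random.gauss(5, 1) for _ in range(n)]
y = [0.5 * a - 1.0 * b + random.gauss(0, 2) for a, b in zip(x, z)]

mx, mz = sum(x) / n, sum(z) / n
cx = [a - mx for a in x]
cz = [b - mz for b in z]

raw_coefs, raw_r2 = ols([x, z, [a * b for a, b in zip(x, z)]], y)
cen_coefs, cen_r2 = ols([cx, cz, [a * b for a, b in zip(cx, cz)]], y)

print("product slope:", round(raw_coefs[3], 6), round(cen_coefs[3], 6))
print("R squared:    ", round(raw_r2, 6), round(cen_r2, 6))
print("slope for x:  ", round(raw_coefs[1], 4), round(cen_coefs[1], 4))
```

The product slope and R squared agree across the two fits apart from floating-point noise. The slopes for the main effects generally differ, because in the raw model they describe the effect of one variable when the other equals zero, while in the centered model they describe it when the other is at its mean.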

Finally, notice the regression coefficients for the other variables. Both CENSAWBS and CENRSES have significant regression coefficients, even though SAWBS did not a few minutes ago. The point here is that the uncentered main effects overlap with the product term that they create (SAWBS very highly so, with r = .91). Breaking up that correlation lets each variable speak for itself.

Now we can say that HIQ is linearly connected to both self-esteem and SAWBS, although there is no interaction between those two variables. You might wonder why I went through all of this when there was no interaction. Well, the simple answer is that I had to do something, and these were the data at hand. It is very easy to talk about interactions, and to explain why they should be there, but it is quite a different thing to find good ones in linear regression. There are at least two reasons for this:

In the first place, interactions may not be as common as we would like to think, and they can easily be hidden by strong main effects. In the second place, the experiments that we design in a laboratory, and the data that we collect in a natural setting, have tremendous differences in power. If you want to find interactions in natural settings, be prepared to invest a lot of effort. This point is very well made in a paper by McClelland and Judd (1993).

Having written that last paragraph, I fear that I have left the reader wondering why anyone would ever look for interactions with regression if they are so hard to find. First, I may have overstated the case for how difficult it is to find them. Moreover, when we do find them they add some very nice explanatory mechanisms for theorists. I strongly advise looking for them, but you won't find them too often.

Moderating and Mediating Effects

This paper is already much too long, but I can't leave it without making at least passing reference to a critically important, and often cited, paper by Baron and Kenny (1986). They distinguish between a moderator effect, in which we look for a variable which partitions the data into separate subgroups with different relationships between the independent and dependent variables, and mediator effects, in which the independent variable works through some other variable to influence the dependent variable.

I like to think of a moderator effect as being similar to an interaction in Anova, where simple effects are different for different levels of one independent variable. In our example here, we have already looked at one sort of interaction. In a slightly different study we might find that there is a relationship between SAWBS and HIQ for females, but no relationship between those variables for males. In this case, SEX would be a moderator variable.

If, however, we believed that body image had no direct effect on depression, but that it lowered self-esteem, which in turn produced depression, then self-esteem would be a mediating variable, because it mediated the apparent relationship between body image and depression.

After reading the Baron and Kenny paper, you might try to look at the relationship between ShPercep, RSES, and depression. A case could be made that the relationship between ShPercep and BDI is mediated by RSES.


This document has gone on much too long, and I will end it here, at least for today. I may add more later.


Baron, R. M., & Kenny, D. A. (1986). The moderator-mediator variable distinction in social psychological research: Conceptual, strategic, and statistical considerations. Journal of Personality and Social Psychology, 51, 1173-1182.

Geller, J., Johnston, C., & Madsen, K. (1997). The role of shape and weight in self-concept: The shape and weight based self-esteem inventory. Cognitive Therapy and Research, 21, 5-24.

McClelland, G. H., & Judd, C. M. (1993). Statistical difficulties of detecting interactions and moderator effects. Psychological Bulletin, 114, 376-390.


Send mail to: David.Howell@uvm.edu

Last Revised: 7/13/98