R Code for Chapter 14

Two Independent Samples

The first block of code below is the code for drawing Figure 14.2. Much of this is straight-forward, but the use of "expression" in the plotting of legends is not so clear But it does draw a nice graphic. Expression" is used in different ways in this code. The first time we see it, it is used to write label the X axis as "X̄_i - X̄₂". This is included directly in the hist() function. We see something similar a few lines later where it is used to insert text at a specific location. Three lines later I created a variable "label1" that is a rather messy equation. "frac" means "fraction," and the two elements within the parentheses are the numerator and denominator. You can probably figure out the rest, although writing it correctly the first time is a challenge. Having created that variable, I insert in on the next line.

That plot is followed by code for running a matched-group t test by pooling the variances and then by not pooling the variances. If you run that you will see that there are only minor differences for this set of data. Notice that the calculation of the degrees of freedom agrees with the result given in the text.

Following the t test, which is really the focus of the analysis, I got carried away and plotted the data for the two groups separately. I then fit a density function to each set of data, and drew in the confidence interval for the mean difference. As I noted, it doesn't seem very sensible to plot that confidence interval in the same plot as the sample histograms, but I wanted you to see it somewhere, so I did.

###  Chapter 14

### Improved version Figure 14.2
pop1 <- rnorm(10000, 50, 10)
pop2 <- rnorm(10000, 55, 10)
nreps <- 10000
meandiff <- numeric(nreps)
for (i in 1:nreps) {
   samp1 <- sample(pop1, 25, replace = TRUE)
   samp2 <- sample(pop2, 25, replace = TRUE)
   meandiff[i] <- mean(samp1) - mean(samp2)
}

hist(meandiff, xlab = expression(bar(X)[1] - bar(X)[2]), main = "Distribution of 
Mean Differences", breaks = 25, freq = FALSE, xlim = c(-15, 5), yaxt = "n", ylab = "") 

x <- seq(-15, 5, .01)
y <- dnorm(x, mean(meandiff), sd(meandiff))
lines(x, y, col = "red" )
text(-2, .03, cex = 1.5, expression(mu[1] - mu[2]))

arrows(x0 = -2.5, y0 = .025, x1 = -5, y1 = 0)
lines(x = c(-5, -1), y = c(.05, .05))
label1 <- expression(sqrt(frac(sigma[1]^2, n[1]) + frac(sigma[1]^2, n[2])))
text(1, .1, cex = 1.5, label1)
arrows(x0 = 1, y0 = .085, x1 = -3, y1 = 0.05)

#################################################################
### Anorexia data from Table 14.1
anorexia <- read.table("http://www.uvm.edu/~dhowell/fundamentals9/DataFiles/Tab14-1.dat",
 header = TRUE)
attach(anorexia)
names(anorexia)
par(mfrow = c(1,1))    # Set up graphical display
# Compare Gain in two groups
result1 <- t.test(Gain ~ Trtment, var.equal = TRUE)  #Pool the variances
cat("Output for Pooled t \n")
print(result1)

result2 <- t.test(Gain ~ Trtment, var.equal = FALSE)  #Do not pool the variances
cat("Output for Non-pooled t \n")
print(result2)

# The following plots the data and fits density plots to each group. I think that 
# I did it more  for my entertainment than anything else.

hist(Gain[Trtment == 1], col = "red", density = 5,main = "Gains by Group", freq = FALSE,
  ylim = c(0,.08), xlab = "Weight Gain")
lines(density(Gain[Trtment == 1]))
hist(Gain[Trtment == 2], col = "blue", density = 5,main = "", freq = FALSE, add = TRUE,
  ylim = c(0,.08), xlab = "")
lines(density(Gain[Trtment == 2]))
lines(x = c( result1$conf.int[1],  result1$conf.int[2]), y = c(.065, .065))
lines(x <- c(result1$conf.int[1], result1$conf.int[1]), y = c(.063, .067))
lines(x <- c(result1$conf.int[2], result1$conf.int[2]), y = c(.063, .067))
legend(-14.8, .072, "CI for Mean Difference", bty = "n")
#  It is a bit odd to plot the CI for mean diff on the histogram, but I wanted
#  to include it somewhere.

The MBESS package by Ken Kelly is an extensive set of functions that do all sorts of useful things. I next use it to calculate confidence limits on the mean difference in the anorexia data. I have used the ci.smd function in two different ways to arrive at the same result. You can either enter the standardized mean difference, which would be (-0.45 - 7.26)/sqrt(58.907) = -1.005 ) or the resulting t value along with the sample sizes. The result is the same.

That analysis is followed by the analysis of the data from Adams, Wright, and Lohr (1996). This is very straight-forward. There is nothing new there.

### Staying with the anorexia data from Everitt.
# Load MBESS to get effect sizes and confidence limits
library(MBESS)
# help(package = MBESS)  # Gets you a list of all functions in MBESS
# ?ci.smd   
# How to get CI on standardized mean difference
# Variances are close to equal, so we will use that analysis
# The t is stored as result1$statistic
# You can either give the function the ncp, which is equal to t or you can give
# it the smd, which is d.

CI1 <- ci.smd(result1$statistic, n.1 = 26, n.2 = 17)  #This uses t as the ncp
cat("The confidence interval on the standard mean difference = \n")
print(CI1)
# Pooled sd <- sqrt(58.907)
d <- (mean(Gain[Trtment == 1]) - mean(Gain[Trtment == 2]))/sqrt(58.907)
cat("Standardized mean difference = ",d,"\n")
CI2 <- ci.smd(smd = d, n.1 = 26, n.2 = 17, conf.level = .95)

# Adams et al. data on homophobia
data <- read.table("http://www.uvm.edu/~dhowell/fundamentals9/DataFiles/Tab14-2.dat", header = TRUE)
names(data)
attach(data)
test <- t.test(Arousal ~ Homophobic, var.equal = TRUE)
library(MBESS)
ci.smd(ncp = -2.48, n.1 = 35, n.2 = 29)
ci.smd(smd = .62, n.1 = 35, n.2 = 29)

The last block of code deals with the study by Damisch et al. (2010) on lucky charms. In this example I simulated a different one of their studies in which participants were asked to putt a golf ball in the presence, or absence, of their lucky charm. I duplicated their means exactly, though the standard deviations may be off by a small amount. There is something new here where I generated a variable called "condition" having 15 ones followed by 15 twos. You can learn more about this command by typing ?rep. (I could have accomplished the same thing with something like
"condition <- c(1,1,1,1,1,....,1,1,2,2,2,2,2,....2,2,2).
Notice that I have worked through the calculations pretty much step-by-step. We have two kinds of confidence limits here. The first, which I print out, are confidence limits on the mean of each condition. The second, which the t test prints out, are confidence limits on the mean difference, produced by the t tests. Finally, I used the ci.smd function of MBESS to get confidence limits on the effect size. I do that by using the t calculated either with and without pooling the variances.



# Lucky Charm Study
# Damisch, L., Stoberock, B., & Mussweiler, T. (2010) Keep you fingers crossed! 
# How superstition improves performance. Psychological Science, 21, 1014 - 1020.

# Study 1
# dv = score sinking 10 putts from 1 meter."
# lucky = "So far this has turned out to be a lucky ball"
# not.lucky = "This is the ball that everyone has used so far."
n1 <- 15; n2 <- 13
mean1 <- 6.42; sd1 <- 1.88
mean2 <- 4.75; sd2 <- 2.15
set.seed(23534)
lucky <- rnorm(n1, mean1, sd1) - 0.44      # Getting mean exactly
not.lucky <- rnorm(n2, mean2, sd2) - 0.06
condition <- rep(c(1,2), c(15, 13))
score <- c(lucky, not.lucky)
means <- tapply(score, condition, mean)
barplot(height = means, width = c(50,50), space = .5, names.arg = c("Lucky", "Not Lucky"), 
ylab = "Mean Score", ylim = c(0,10), density = 5, col = "green", 
main = "Putting scores and 95% CIs")
t.lucky <- qt(.975, 14)      # Critical value of t
t.unlucky <- qt(.975, 12)
se.1 <- sd(lucky)/sqrt(n1)
se.2 <- sd(not.lucky)/sqrt(n2)
cat("Confidence limits for lucky condition \n")
CI1.lower = mean(lucky)-se.1*t.lucky
CI1.upper = mean(lucky)+se.1*t.lucky
cat(CI1.lower, "    ",CI1.upper)
CI2.lower = mean(not.lucky)-se.2*t.lucky
CI2.upper = mean(not.lucky)+se.2*t.lucky
cat("Confidence limits for not.lucky condition \n")
cat(CI2.lower, "    ", CI2.upper)
y1 <- c(CI1.lower, CI1.upper)
y2 <- c(CI2.lower, CI2.upper)
x1 <- c(50, 50)
x2 <- c(125, 125)
lines(x1,y1)
lines(x2, y2)

t.test(score ~ condition)
t.test(score ~ condition, var.equal = TRUE)

##### Add in effect size and CI on it--Try with or without pooled

library(MBESS)
st.mean.diff <- smd(lucky, not.lucky)
ci.smd(ncp = 2.831, n.1 = 15, n.2 = 13, conf.level = .95)
ci.smd(ncp = 2.88806, n.1 = 15, n.2 = 13, conf.level = .95)

dch: