Maximum Likelihood

Binomial distribution: Pr(X=k)=choose(n, k) p^k (1-p)^(n-k)

where choose(n,k)=n!/(k!(n-k)!) n samples k successes

Bernoulli distribution

Pr(X=1)= p^k (1-p)^k for k= (0 or 1) n = 1 samples k = 0 or 1 successes

Problem 1: Sampling Distribution vs. Likelihood function

In an experiment, 10 seed are planted. 7 are observed to germinate and produce seedlings. i. What is an appropriate distribution to use to model this process? Why? Solution: Binomial distribution as it models numbers of successes in n trials.

ii. Plot the Sampling distribution:

Calculate P(x|p) for x=0,1,2,3,4 assuming p->0.5 and plot Probability Mass Function

Solution: Random sampling

x<-rbinom(n=10000,size=4,prob=0.5)
xProb<-hist(x,breaks=c(0,0.5,1.5,2.5,3.5,4.5),plot=T,probability=TRUE)

Solution: Exact calculations

plot(0:10,dbinom(x=0:10,size=10,prob=0.5),ylab="Prob",xlab="Number of Successes",ylim=c(0,0.4))
mtext(side=3,"Sampling Distribution",line=1)

iii. Plot the Likelihood function given 7 of the 10 seeds germinated. Estimate the mle for p?

Determine the likely value of p given 3 seedlings for 4 seeds Calculate P(x|p) for x=3 while varying p on range (0,1) and plot Likelihood function

plot(seq(from=0,to=1,length=100),dbinom(x=7,size=10,prob=seq(from=0,to=1,length=100)),
     typ="l",ylab="Likelihood",xlab="P")
mtext(side=3,"Likelihood")
abline(v=0.70)

Problem 2

Question. Is a coin fair? You sample 18 coin tosses and observe 10 heads. Compare the likelihood function for p in the following two cases: A Bernoulli process that results in c(0,1,1,0,0,1,0,1,0,0,1,1,1,1,0,0,1,1) and a Binomial(k=10,n=18).
i. Plot the likelihood functions in the same figure.

likBern<-function(prob,k){
  -sum(dBern(k,prob))
}

nllBern<-function(prob,s){
  -sum(dbinom(x=s,n=1,prob,log=TRUE))
}

dataV<-c(0,1,1,0,0,1,0,1,0,0,1,1,1,1,0,0,1,1)
pVect<-seq(from=0,to=1,length=100)

dBern<-function(x,p){
  return(log(p^x*(1-p)^(1-x)))
}

logLik<-sapply(pVect , likBern,k=dataV)
plot(pVect,-logLik,type="l")

sum(dataV)

## [1] 10

length(dataV)

## [1] 18

likBin<-function(prob,n,k){
  -sum(dbinom(k,n,prob,log=TRUE))
}

pVect<-seq(from=0,to=1,length=100)
logLik<-sapply(pVect , likBin,n=18,k=10)
plot(pVect,-logLik,type="l")

lowerLik<-max(-logLik+35)/10
myRange<-range(pVect[(-logLik+35)>lowerLik])
abline(v=c(myRange))

Are the likelihood functions the same? Why or why not?

Yes, the kernels look the same.

Plot confidence limits on the likelihood plot.

See the confidence limits in the plot above.

Problem 3

Now instead of observing only (k=10,n=18) for a single coin, you are interested in testing the population of coins produced at a US mint. You sample 10 coins, 10 times each. You observe the following number of heads: 6 3 7 3 6 4 4 5 3 8 Plot the likelihood function with the confidence limits as in Problem 2 iv. Is the mint producing fair coins?

likBin<-function(prob,n,k){
  -sum(dbinom(k,n,prob,log=TRUE))
}

headVect<-c(6, 3, 7, 3, 6, 4, 4, 5, 3, 8)
likBin(0.5,10,headVect)

## [1] 19.44023

pVect<-seq(from=0,to=1,length=100)
logLik<-sapply(pVect , likBin,n=10,k=headVect)
plot(pVect,-logLik,type="l")

Yes, the factory seems to be producing fair coins. No evidence against this hypothesis, anyway.

Problem 4

Make a figure with 6 panels that show the likelihoods for the following samples (seedlings,seeds): (3,4),(6,8),(12,16),(24,32),(300,400), and (3000,4000). Hint: Use par(mfrow=c(3,2)) to set up a figure with six panels

parOld<-par(mfrow=c(3,2),oma=c(1,1,1,1),mar=c(1,1,1,1)+0.5)
plot(seq(from=0,to=1,length=100),dbinom(x=3,size=4,prob=seq(from=0,to=1,length=100)),
     typ="l",ylab="Likelihood",xlab="P")
plot(seq(from=0,to=1,length=100),dbinom(x=6,size=8,prob=seq(from=0,to=1,length=100)),
     typ="l",ylab="Likelihood",xlab="P")
plot(seq(from=0,to=1,length=100),dbinom(x=12,size=16,prob=seq(from=0,to=1,length=100)),
     typ="l",ylab="Likelihood",xlab="P")
plot(seq(from=0,to=1,length=100),dbinom(x=24,size=32,prob=seq(from=0,to=1,length=100)),
     typ="l",ylab="Likelihood",xlab="P")
plot(seq(from=0,to=1,length=100),dbinom(x=300,size=400,prob=seq(from=0,to=1,length=100)),
     typ="l",ylab="Likelihood",xlab="P")
plot(seq(from=0,to=1,length=100),dbinom(x=3000,size=4000,prob=seq(from=0,to=1,length=100)),
     typ="l",ylab="Likelihood",xlab="P")

par(parOld)

Problem 5

Let’s simulate seedling counts in forest quadrats. Let’s imagine that we have a field experiment where we are measuring seedlings in forest gaps vs closed canopy.

Simulate data that represent 20 quadrats sampled from a forest gap using a lamba (Poisson parameter) of 10 and 20 quadrats from beneath forest canopy with a lamba of 5.

gap<-rpois(n=20,lambda = 10)
canopy<-rpois(n=20,lambda = 5)
hist(gap)

hist(canopy)

hist(gap)

Plot the likelihood function for the lamba parameters for the gap and forest plots. Estimate the mle for the gap and forest plots.

nllPois<-function(parVec,data){
  nllik<- -sum(dpois(x=data,lambda=parVec,log=TRUE))
  # cat("nllik= ",nllik,sep=" ",fill=T);cat(" ",sep=" ",fill=T)
  return(nllik)
}

# testing function
nllPois(1,canopy)

## [1] 110.1596

Plotting log likelihood for CANOPY seedling counts

lambdaVect<-seq(from=0,to=20,length=100)
logLik<-sapply(lambdaVect , nllPois,data=canopy )
plot(lambdaVect,-logLik,type="l")

require(Rvmmin)

## Loading required package: Rvmmin

parVec<-1.
outPoisR<-Rvmmin(par=parVec,fn=nllPois,
               lower=c(0.01),upper=c(Inf),
               data=canopy)
outPoisR$value

## [1] 40.23092

outPoisR$par

## [1] 4.65

Plotting log likelihood for GAP seedling counts

lambdaVect<-seq(from=0,to=20,length=100)
nlogLik<-sapply(lambdaVect , nllPois,data=gap )
plot(lambdaVect,-nlogLik,type="l")

require(Rvmmin)
parVec<-1.5
outPoisR<-Rvmmin(par=parVec,fn=nllPois,
               lower=c(0.01),upper=c(Inf),
               data=gap)
outPoisR$value

## [1] 47.73273

outPoisR$par

## [1] 11

SCRATCH

This is an R Markdown Notebook. When you execute code within the notebook, the results appear beneath the code.

Try executing this chunk by clicking the Run button within the chunk or by placing your cursor inside it and pressing Cmd+Shift+Enter.

plot(cars)

Add a new chunk by clicking the Insert Chunk button on the toolbar or by pressing Cmd+Option+I.

When you save the notebook, an HTML file containing the code and output will be saved alongside it (click the Preview button or press Cmd+Shift+K to preview the HTML file).