where choose(n,k)=n!/(k!(n-k)!) n samples k successes
Pr(X=1)= p^k (1-p)^k for k= (0 or 1) n = 1 samples k = 0 or 1 successes
In an experiment, 10 seed are planted. 7 are observed to germinate and produce seedlings. i. What is an appropriate distribution to use to model this process? Why? Solution: Binomial distribution as it models numbers of successes in n trials.
Calculate P(x|p) for x=0,1,2,3,4 assuming p->0.5 and plot Probability Mass Function
Solution: Random sampling
x<-rbinom(n=10000,size=4,prob=0.5)
xProb<-hist(x,breaks=c(0,0.5,1.5,2.5,3.5,4.5),plot=T,probability=TRUE)
Solution: Exact calculations
plot(0:10,dbinom(x=0:10,size=10,prob=0.5),ylab="Prob",xlab="Number of Successes",ylim=c(0,0.4))
mtext(side=3,"Sampling Distribution",line=1)
Determine the likely value of p given 3 seedlings for 4 seeds Calculate P(x|p) for x=3 while varying p on range (0,1) and plot Likelihood function
plot(seq(from=0,to=1,length=100),dbinom(x=7,size=10,prob=seq(from=0,to=1,length=100)),
typ="l",ylab="Likelihood",xlab="P")
mtext(side=3,"Likelihood")
abline(v=0.70)
Question. Is a coin fair? You sample 18 coin tosses and observe 10 heads. Compare the likelihood function for p in the following two cases: A Bernoulli process that results in c(0,1,1,0,0,1,0,1,0,0,1,1,1,1,0,0,1,1) and a Binomial(k=10,n=18).
i. Plot the likelihood functions in the same figure.
likBern<-function(prob,k){
-sum(dBern(k,prob))
}
nllBern<-function(prob,s){
-sum(dbinom(x=s,n=1,prob,log=TRUE))
}
dataV<-c(0,1,1,0,0,1,0,1,0,0,1,1,1,1,0,0,1,1)
pVect<-seq(from=0,to=1,length=100)
dBern<-function(x,p){
return(log(p^x*(1-p)^(1-x)))
}
logLik<-sapply(pVect , likBern,k=dataV)
plot(pVect,-logLik,type="l")
sum(dataV)
## [1] 10
length(dataV)
## [1] 18
likBin<-function(prob,n,k){
-sum(dbinom(k,n,prob,log=TRUE))
}
pVect<-seq(from=0,to=1,length=100)
logLik<-sapply(pVect , likBin,n=18,k=10)
plot(pVect,-logLik,type="l")
lowerLik<-max(-logLik+35)/10
myRange<-range(pVect[(-logLik+35)>lowerLik])
abline(v=c(myRange))
Yes, the kernels look the same.
See the confidence limits in the plot above.
Now instead of observing only (k=10,n=18) for a single coin, you are interested in testing the population of coins produced at a US mint. You sample 10 coins, 10 times each. You observe the following number of heads: 6 3 7 3 6 4 4 5 3 8 Plot the likelihood function with the confidence limits as in Problem 2 iv. Is the mint producing fair coins?
likBin<-function(prob,n,k){
-sum(dbinom(k,n,prob,log=TRUE))
}
headVect<-c(6, 3, 7, 3, 6, 4, 4, 5, 3, 8)
likBin(0.5,10,headVect)
## [1] 19.44023
pVect<-seq(from=0,to=1,length=100)
logLik<-sapply(pVect , likBin,n=10,k=headVect)
plot(pVect,-logLik,type="l")
Yes, the factory seems to be producing fair coins. No evidence against this hypothesis, anyway.
Make a figure with 6 panels that show the likelihoods for the following samples (seedlings,seeds): (3,4),(6,8),(12,16),(24,32),(300,400), and (3000,4000). Hint: Use par(mfrow=c(3,2)) to set up a figure with six panels
parOld<-par(mfrow=c(3,2),oma=c(1,1,1,1),mar=c(1,1,1,1)+0.5)
plot(seq(from=0,to=1,length=100),dbinom(x=3,size=4,prob=seq(from=0,to=1,length=100)),
typ="l",ylab="Likelihood",xlab="P")
plot(seq(from=0,to=1,length=100),dbinom(x=6,size=8,prob=seq(from=0,to=1,length=100)),
typ="l",ylab="Likelihood",xlab="P")
plot(seq(from=0,to=1,length=100),dbinom(x=12,size=16,prob=seq(from=0,to=1,length=100)),
typ="l",ylab="Likelihood",xlab="P")
plot(seq(from=0,to=1,length=100),dbinom(x=24,size=32,prob=seq(from=0,to=1,length=100)),
typ="l",ylab="Likelihood",xlab="P")
plot(seq(from=0,to=1,length=100),dbinom(x=300,size=400,prob=seq(from=0,to=1,length=100)),
typ="l",ylab="Likelihood",xlab="P")
plot(seq(from=0,to=1,length=100),dbinom(x=3000,size=4000,prob=seq(from=0,to=1,length=100)),
typ="l",ylab="Likelihood",xlab="P")
par(parOld)
Let’s simulate seedling counts in forest quadrats. Let’s imagine that we have a field experiment where we are measuring seedlings in forest gaps vs closed canopy.
gap<-rpois(n=20,lambda = 10)
canopy<-rpois(n=20,lambda = 5)
hist(gap)
hist(canopy)
hist(gap)
nllPois<-function(parVec,data){
nllik<- -sum(dpois(x=data,lambda=parVec,log=TRUE))
# cat("nllik= ",nllik,sep=" ",fill=T);cat(" ",sep=" ",fill=T)
return(nllik)
}
# testing function
nllPois(1,canopy)
## [1] 110.1596
lambdaVect<-seq(from=0,to=20,length=100)
logLik<-sapply(lambdaVect , nllPois,data=canopy )
plot(lambdaVect,-logLik,type="l")
require(Rvmmin)
## Loading required package: Rvmmin
parVec<-1.
outPoisR<-Rvmmin(par=parVec,fn=nllPois,
lower=c(0.01),upper=c(Inf),
data=canopy)
outPoisR$value
## [1] 40.23092
outPoisR$par
## [1] 4.65
lambdaVect<-seq(from=0,to=20,length=100)
nlogLik<-sapply(lambdaVect , nllPois,data=gap )
plot(lambdaVect,-nlogLik,type="l")
require(Rvmmin)
parVec<-1.5
outPoisR<-Rvmmin(par=parVec,fn=nllPois,
lower=c(0.01),upper=c(Inf),
data=gap)
outPoisR$value
## [1] 47.73273
outPoisR$par
## [1] 11
SCRATCH
This is an R Markdown Notebook. When you execute code within the notebook, the results appear beneath the code.
Try executing this chunk by clicking the Run button within the chunk or by placing your cursor inside it and pressing Cmd+Shift+Enter.
plot(cars)
Add a new chunk by clicking the Insert Chunk button on the toolbar or by pressing Cmd+Option+I.
When you save the notebook, an HTML file containing the code and output will be saved alongside it (click the Preview button or press Cmd+Shift+K to preview the HTML file).