As part of other related work, I put together a bit of code to explore parameter estimation under the bounded Brownian motion (BBM) model of Boucher & DÃ©mery (2016). This might be of interest to readers of this blog, so I thought I’d try posting it here.

TLDR: we can estimate \(\sigma^2\) pretty closely using `bounded_bm`

, but estimating \(x_0\) is virtually useless for any condition in which the bounds are reached multiple times during the simulation.

First, let’s start off by loading *phytools*.

```
library(phytools)
packageVersion("phytools")
```

```
## [1] '2.1.14'
```

Now, before I proceed, in my simulations I’ll use multiple values of \(\sigma^2\) and \(x_0\) for a fixed total tree length, but they’ll have a mean value of \(\sigma^2 = 1.0\) and \(x_0 = 0.0\), with constant bounds \([-1, 1]\).

To see that this will typically result in the bounde being reached multiple times, let’s use my illustrative simulation code from an earlier post. This shows us that simulation under these parameters is very bounded indeed.

```
tree<-pbtree(n=100,scale=1)
h<-max(nodeHeights(tree))
mt<-make.era.map(tree,seq(0,h,length.out=100))
st<-map.to.singleton(mt)
y<-fastBM(st,sig2=1,a=0,internal=TRUE,bounds=c(-1,1))
par(mar=c(5.1,4.1,1.1,1.1))
phenogram(st,y,ftype="off",fsize=0.4,
mar=c(5.1,4.1,1.1,1.1),lwd=5,las=1,cex.axis=0.8,
spread.cost=c(1,0),ylim=c(-1.1,1.1))
grid()
phenogram(st,y,ftype="off",add=TRUE,lwd=3,
color="white")
phenogram(st,y,ftype="off",add=TRUE,lwd=1,
color="blue")
segments(x0=c(0,0),y0=c(-1,1),x1=c(h,h),y1=c(-1,1),
lwd=6,col=palette()[2],lend=2)
segments(x0=c(0,0),y0=c(-1,1),x1=c(h,h),y1=c(-1,1),
lwd=2,lty="dashed",lend=2)
```

Now let’s proceed to our study.

To begin, I’ll set the number of simulations (100) and create a matrix for our results.

```
nsim<-100
results<-matrix(NA,nsim,7,dimnames=list(1:nsim,c("true_sig2",
"true_x0","sig2","x0","logL","bm_sig2","bm_x0")))
```

Next, I’ll use a `for`

loop to iterate `nsim`

simulations and model-fits.

```
for(i in 1:nsim){
## simulate
tree<-pbtree(n=100,scale=1)
x<-fastBM(tree,sig2=sig2<-runif(n=1,0,2),
a=x0<-runif(n=1,-1,1),
bounds=c(-1,1))
## fit using phytools::bounded_bm
bounded.fit<-bounded_bm(tree,x,lims=c(-1,1),
root="mle",lik.func="eigen",parallel=TRUE,levs=200)
## fit standard BM
bm.fit<-geiger::fitContinuous(tree,x)
## store result
results[i,]<-c(sig2,x0,bounded.fit$sigsq,
bounded.fit$x0,logLik(bounded.fit),
bm.fit$opt$sigsq,bm.fit$opt$z0)
}
```

Finally, let’s graph our simulated & estimated \(\sigma^2\), along with our simulated and estimated \(x_0\).

```
par(mfrow=c(1,2),mar=c(5.1,5.1,2.1,2.1))
plot(results[,"true_sig2"],results[,"sig2"],
xlab=expression(paste("generating ",sigma^2)),
ylab=expression(paste("estimated ",sigma^2," (bounded_bm)")),
bty="n",pch=21,bg="grey",las=1,cex.axis=0.8,cex=1.2)
grid()
legend("topleft","1:1 line",lwd=1,bty="n")
lines(range(par()$usr),range(par()$usr))
plot(results[,"true_x0"],results[,"x0"],
xlab=expression(paste("generating ",x[0])),
ylab=expression(paste("estimated ",x[0]," (bounded_bm)")),
bty="n",pch=21,bg="grey",las=1,cex.axis=0.8,cex=1.2)
grid()
lines(range(par()$usr),range(par()$usr))
```

This shows that we can estimate the parameters of the process – *particularly* \(\sigma^2\). (TBF, it’s fairly unsurprising that \(x_0\) is so much more difficult to get, since the equilibrium distribution under BBM is uniform on the interval defined by the bounds!)

Why don’t we also *compare* this to estimates of \(\sigma^2\) and \(x_0\) from `geiger::fitContinuous`

assuming *unbounded* Brownian motion, even though this wasn’t the generating process.

```
par(mfrow=c(1,2),mar=c(5.1,5.1,2.1,2.1))
plot(results[,"true_sig2"],results[,"bm_sig2"],
xlab=expression(paste("generating ",sigma^2)),
ylab=expression(paste("estimated ",sigma^2," (fitContinuous)")),
bty="n",pch=21,bg="grey",las=1,cex.axis=0.8,cex=1.2)
grid()
legend("topleft","1:1 line",lwd=1,bty="n")
lines(range(par()$usr),range(par()$usr))
plot(results[,"true_x0"],results[,"bm_x0"],
xlab=expression(paste("generating ",x[0])),
ylab=expression(paste("estimated ",x[0]," (fitContinuous)")),
bty="n",pch=21,bg="grey",las=1,cex.axis=0.8,cex=1.2)
grid()
lines(range(par()$usr),range(par()$usr))
```

We see that \(\sigma^2\) is systematically *underestimated* by `fitContinuous`

– increasingly for larger \(\sigma^2\), but *unbiasedly* estimated by `bounded_bm`

.

Cool!

That’s all for now, folks.

## No comments:

## Post a Comment

Note: due to the very large amount of spam, all comments are now automatically submitted for moderation.