Sunday, February 18, 2024

More about parameter estimation under bounded Brownian motion using bounded_bm

This morning I posted about parameter estimation under a model of Brownian motion with reflective bounds developed by Boucher & Démery (2016) & now implemented in the phytools function bounded_bm.

One of the observations of that post was that the Brownian rate, \(\sigma^2\), could be estimated quite accurately – while the root state, \(x_0\), tended to be a bit all over the place.

This is kind of unsurprising, given that the equilibrium distribution (the expected distribution of the data after \(\infty\) time has elapsed) under bounded Brownian motion is just uniform on the interval defined by the bounds. The implication is that the closer we are to the equilibrium distribution (i.e., the longer our total tree length for a given \(\sigma^2\) and finite bounds), the less we're likely to be able to say from tip data about the starting condition of the system: the root state.
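
To make that intuition a bit more concrete, here's a quick sketch (the tree depth & parameter values below are arbitrary, chosen purely for illustration): if we simulate bounded Brownian motion on a tree that's very deep relative to \(\sigma^2\) & the width of the bounds, the distribution of tip values should come out looking close to uniform on the interval, more or less regardless of where the process started.

library(phytools)
## quick illustration with arbitrary values: on a very deep tree, tip values
## under bounded Brownian motion should be nearly uniform on the bounds
eq_tree<-pbtree(n=200,scale=50)
eq_x<-fastBM(eq_tree,sig2=1,a=-0.9,bounds=c(-1,1))
hist(eq_x,breaks=20,freq=FALSE,xlab="tip value",main="")
abline(h=0.5,lty="dotted") ## density of a uniform distribution on [-1,1]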

I thought it would be interesting to visualize the likelihood surface along each parameter dimension. We might, for instance, expect the likelihood surface for \(\sigma^2\) to be quite peaked, while the likelihood surface for \(x_0\) is relatively flat.

To make this a heck of a lot easier for the user, I now have phytools::bounded_bm export the likelihood function as a component of the model object. (Lots of phytools functions, and model-fitting functions in geiger, already do this.)

library(phytools)
packageVersion("phytools")
## [1] '2.1.15'

Let’s start off with our visual simulator of bounded Brownian motion. I’ll use randomly chosen values of \(\sigma^2\) (on \([1.0, 2.0]\)) and \(x_0\) (on \([-1, 1]\)), the latter of which will be the same interval as my bounds.

sig2<-runif(n=1,min=1,max=2)
sig2
## [1] 1.928821
x0<-runif(n=1,min=-1,max=1)
x0
## [1] -0.8481281
tree<-pbtree(n=100,scale=1)
h<-max(nodeHeights(tree))
## slice the tree into many short segments so that the bounded path is visible
mt<-make.era.map(tree,seq(0,h,length.out=100))
st<-map.to.singleton(mt)
## simulate bounded Brownian motion, keeping states at all the singleton nodes
y<-fastBM(st,sig2=sig2,a=x0,internal=TRUE,bounds=c(-1,1))
par(mar=c(5.1,4.1,1.1,1.1))
phenogram(st,y,ftype="off",fsize=0.4,
  mar=c(5.1,4.1,1.1,1.1),lwd=5,las=1,cex.axis=0.8,
  spread.cost=c(1,0),ylim=c(-1.1,1.1))
grid()
phenogram(st,y,ftype="off",add=TRUE,lwd=3,
  color="white")
phenogram(st,y,ftype="off",add=TRUE,lwd=1,
  color="blue")
## overlay the bounds at -1 & 1
segments(x0=c(0,0),y0=c(-1,1),x1=c(h,h),y1=c(-1,1),
  lwd=6,col=palette()[2],lend=2)
segments(x0=c(0,0),y0=c(-1,1),x1=c(h,h),y1=c(-1,1),
  lwd=2,lty="dashed",lend=2)

(Figure: traitgram of the simulated bounded Brownian motion, with the bounds at -1 & 1 overlaid.)

(Remember, this is just for fun. We can also simulate bounded Brownian motion on our original tree using phytools::fastBM as I demonstrated here.)
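
(For instance, though it's not part of the workflow in this post, something along the following lines should give us bounded Brownian tip data on the original tree directly, skipping the era-map discretization that I'm using here only for plotting.)

## sketch: simulate bounded Brownian tip data directly on the original tree
y_direct<-fastBM(tree,sig2=sig2,a=x0,bounds=c(-1,1))
head(y_direct)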

Our trait vector, y, contains the state values both at the tips & at nodes all along the internal edges of the tree. Let's pull out just the tip values, since these will be the input for our analysis.

x<-y[tree$tip.label]
head(x,20)
##          t62          t63          t58          t88          t89          t44          t13           t3 
## -0.483493102 -0.694327704 -0.785763797  0.144816311  0.210487417 -0.338574022  0.659083449 -0.946025618 
##          t27          t90          t91          t18           t9          t78          t79          t34 
## -0.396395409 -0.687180530 -0.905217375 -0.915303105 -0.386274377 -0.935836530 -0.673887705 -0.453816871 
##          t29          t25          t26          t55 
## -0.850019353  0.004653449 -0.188978150  0.907332569

Great. Now we’re ready to fit the model.

bounded_fit<-bounded_bm(tree,x,lims=c(-1,1),lik.func="eigen",
  parallel=TRUE,root="mle")
bounded_fit
## Object of class "bounded_bm" based on
##  	a discretization with k = 200 levels.
## 
## Unwrapped (i.e., bounded) model
## 
## Set or estimated bounds: [ -1 , 1 ]
## 
## Fitted model parameters:
##   sigsq: 1.77218 
##      x0: -0.575 
## 
## Log-likelihood: -31.70313 
## 
## R thinks it has found the ML solution.

Our fitted model object in turn contains a likelihood function that, when given any values for \(\sigma^2\) and \(x_0\), will return the corresponding log-likelihood.

This is what I mean.

## ML estimates
bounded_fit$lik(sig2=bounded_fit$sigsq,x0=bounded_fit$x0,
  parallel=FALSE)
## [1] -31.70313
## generating values
bounded_fit$lik(sig2=sig2,x0=x0,parallel=FALSE)
## [1] -31.79579

(I set the optional argument parallel=FALSE because, even though parallelization helped us out a lot during optimization, it's inefficient here: with just one pass through the tree to compute the likelihood for a fixed set of parameter values, setting up the socket cluster consumes more computational time than using it saves!)
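
If you're curious, you could check this on your own machine by timing a single evaluation both ways, along the lines of the snippet below. (Here I'm assuming that the exported likelihood function accepts parallel=TRUE just as it accepts parallel=FALSE; the exact timings will, of course, vary from system to system.)

## rough timing comparison for a single likelihood evaluation
system.time(bounded_fit$lik(sig2=sig2,x0=x0,parallel=TRUE))
system.time(bounded_fit$lik(sig2=sig2,x0=x0,parallel=FALSE))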

Now let's take this same approach to map out our likelihood surface along each parameter dimension. This time, we'll just evaluate the likelihood at a set of evenly spaced values of \(\sigma^2\) and \(x_0\). Here, I'll use the powerful R package future.apply to parallelize my sapply call across computer cores.

library(future.apply)
nn<-50 ## number of values to compute for surface
ncores<-min(c(parallel::detectCores()-1,nn))
plan(multisession,workers=ncores)
finite<-function(x) x[is.finite(x)]
par(mfrow=c(1,2),mar=c(5.1,4.1,1.1,1.1))
sig2<-seq(0.1,4,length.out=nn)
lnL_sig2<-future_sapply(sig2,bounded_fit$lik,x0=bounded_fit$x0,
  parallel=FALSE,future.seed=TRUE)
## Warning in log(pp): NaNs produced
plot(sig2,lnL_sig2,type="l",bty="n",
  xlab=expression(sigma^2),ylab="log(L)",las=1,cex.axis=0.8)
grid()
clip(0,4,min(finite(lnL_sig2)),max(lnL_sig2))
abline(v=bounded_fit$sigsq,lty="dotted")
text(x=bounded_fit$sigsq,y=(logLik(bounded_fit)+min(finite(lnL_sig2)))/2,
  expression(paste("MLE  ",sigma^2)),srt=90,pos=4,cex=0.8)
x0<-seq(-1+1e-6,1-1e-6,length.out=nn)
lnL_x0<-future_sapply(x0,bounded_fit$lik,sig2=bounded_fit$sigsq,
  parallel=FALSE,future.seed=TRUE)
plot(x0,lnL_x0,type="l",bty="n",
  xlab=expression(x[0]),ylab="log(L)",las=1,cex.axis=0.8)
grid()
clip(-1,1,min(lnL_x0),max(lnL_x0))
abline(v=bounded_fit$x0,lty="dotted")
text(x=bounded_fit$x0,y=(logLik(bounded_fit)+min(finite(lnL_x0)))/2,
  expression(paste("MLE  ",x[0])),srt=90,pos=4,cex=0.8)

(Figure: profile log-likelihoods for \(\sigma^2\), left, & \(x_0\), right, with each MLE marked by a vertical dotted line.)

plan(sequential)

It may not be overly apparent from these graphs, but if we look at the vertical axes of the two plots, we should see that the scale is much narrower for \(x_0\) than for \(\sigma^2\), indicating much greater curvature (i.e., more information about the parameter) for \(\sigma^2\) than for \(x_0\).
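
One quick way to put a number on that, using the profile values we just computed, is to compare how far the log-likelihood falls across each profile.

## total drop in log(L) across each profile
diff(range(finite(lnL_sig2)))
diff(range(finite(lnL_x0)))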

This will be borne out if we compute the matrix of second derivatives of our likelihood surface around the MLE solution, which we can do using numDeriv::hessian as follows.

foo<-function(par) bounded_fit$lik(par[1],par[2],parallel=FALSE)
H<-numDeriv::hessian(foo,c(bounded_fit$sigsq,bounded_fit$x0))
H
##            [,1]       [,2]
## [1,] -8.4466111 -0.5417145
## [2,] -0.5417145 -1.4801700

Larger negative values on the diagonal indicate steeper curvature of the likelihood surface around the MLE, which is just what we see here for \(\sigma^2\) compared to \(x_0\). (Conversely, the inverse of the negative Hessian, -H, will, according to the asymptotic theory of maximum likelihood, approximate the variance-covariance matrix of our parameter estimates.)

solve(-H)
##             [,1]        [,2]
## [1,]  0.12123631 -0.04437022
## [2,] -0.04437022  0.69183672
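
If we wanted approximate standard errors for our two parameter estimates, we could simply take the square roots of the diagonal of this matrix, as follows.

## approximate standard errors of sigma^2 & x0, respectively
sqrt(diag(solve(-H)))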

This is all I’m saying about that for now!
