This morning I posted about parameter estimation under a model of Brownian motion with reflexive bounds developed by Boucher & Démery (2016) & now implemented in the phytools function bounded_bm
.
One of the observations of that post was that the Brownian rate, \(\sigma^2\), could be estimated quite accurately – while the root state, \(x_0\), tended to be a bit all over the place.
This is kind of unsurprising, given that the equilibrium distribution (the expected distribution of the data after \(\infty\) time has elapsed) under bounded Brownian motion with bounds is just uniform on the interval of those bounds. The implication is that the closer we are to the equilibrium distribution (the longer our tree length for a given \(\sigma^2\) and finite bounds), the less we’re likely to able to say from tip data about the starting condition of the system: the root state.
I thought it would be interesting to visualize the likelihood surface along each parameter dimension. We might, for instance, expect the likelihood surface for \(\sigma^2\) to be quite peaked, while the likelihood surface for \(x_0\) is relatively flat.
To make this a heck of a lot easier for the user, I now have phytools::bounded_bm
export the likelihood function as a component of the model object. (Lots of phytools functions, and model-fitting functions in geiger, already do this.)
library(phytools)
packageVersion("phytools")
## [1] '2.1.15'
Let’s start off with our visual simulator of bounded Brownian motion. I’ll use randomly chosen values of \(\sigma^2\) (on \([1.0, 2.0]\)) and \(x_0\) (on \([-1, 1]\)), the latter of which will be the same interval as my bounds.
sig2<-runif(n=1,min=1,max=2)
sig2
## [1] 1.928821
x0<-runif(n=1,min=-1,max=1)
x0
## [1] -0.8481281
tree<-pbtree(n=100,scale=1)
h<-max(nodeHeights(tree))
mt<-make.era.map(tree,seq(0,h,length.out=100))
st<-map.to.singleton(mt)
y<-fastBM(st,sig2=sig2,a=x0,internal=TRUE,bounds=c(-1,1))
par(mar=c(5.1,4.1,1.1,1.1))
phenogram(st,y,ftype="off",fsize=0.4,
mar=c(5.1,4.1,1.1,1.1),lwd=5,las=1,cex.axis=0.8,
spread.cost=c(1,0),ylim=c(-1.1,1.1))
grid()
phenogram(st,y,ftype="off",add=TRUE,lwd=3,
color="white")
phenogram(st,y,ftype="off",add=TRUE,lwd=1,
color="blue")
segments(x0=c(0,0),y0=c(-1,1),x1=c(h,h),y1=c(-1,1),
lwd=6,col=palette()[2],lend=2)
segments(x0=c(0,0),y0=c(-1,1),x1=c(h,h),y1=c(-1,1),
lwd=2,lty="dashed",lend=2)
(Remember, this is just for fun. We can also simulated bounded Brownian motion on our original tree using phytools::fastBM
as I demonstrated here.)
Our trait vector, y
, contains the state values for both the tips & all along the internal edges of the tree. Let’s just pull out just the tip values, since this will be input for our analysis.
x<-y[tree$tip.label]
head(x,20)
## t62 t63 t58 t88 t89 t44 t13 t3
## -0.483493102 -0.694327704 -0.785763797 0.144816311 0.210487417 -0.338574022 0.659083449 -0.946025618
## t27 t90 t91 t18 t9 t78 t79 t34
## -0.396395409 -0.687180530 -0.905217375 -0.915303105 -0.386274377 -0.935836530 -0.673887705 -0.453816871
## t29 t25 t26 t55
## -0.850019353 0.004653449 -0.188978150 0.907332569
Great. Now we’re ready to fit the model.
bounded_fit<-bounded_bm(tree,x,lims=c(-1,1),lik.func="eigen",
parallel=TRUE,root="mle")
bounded_fit
## Object of class "bounded_bm" based on
## a discretization with k = 200 levels.
##
## Unwrapped (i.e., bounded) model
##
## Set or estimated bounds: [ -1 , 1 ]
##
## Fitted model parameters:
## sigsq: 1.77218
## x0: -0.575
##
## Log-likelihood: -31.70313
##
## R thinks it has found the ML solution.
Our fitted model object in turn contains a likelihood function that, when given any values for \(\sigma^2\) and \(x_0\) will return the log-likelihood.
This is what I mean.
## ML estimates
bounded_fit$lik(sig2=bounded_fit$sigsq,x0=bounded_fit$x0,
parallel=FALSE)
## [1] -31.70313
## generating values
bounded_fit$lik(sig2=sig2,x0=x0,parallel=FALSE)
## [1] -31.79579
(I set the optional argument parallel=FALSE
because even though parallelization helped us out a lot during optimization, it is inefficient here with only one pass through the tree to compute the likelihood for a fixed set of parameter values. This is because setting up our socket cluster consumes more computational time than using it saves!)
Now let’s take the same approach to obtain an estimate of our likelihood surface. This time, we’ll just evaluate the likelihood for a set of evenly spaced values of \(\sigma^2\) and \(x_0\). Here, I’ll use the powerful R package future.apply to parallelize my sapply
call across computer cores.
library(future.apply)
nn<-50 ## number of values to compute for surface
ncores<-min(c(parallel::detectCores()-1,nn))
plan(multisession,workers=ncores)
finite<-function(x) x[is.finite(x)]
par(mfrow=c(1,2),mar=c(5.1,4.1,1.1,1.1))
sig2<-seq(0.1,4,length.out=nn)
lnL_sig2<-future_sapply(sig2,bounded_fit$lik,x0=bounded_fit$x0,
parallel=FALSE,future.seed=TRUE)
## Warning in log(pp): NaNs produced
plot(sig2,lnL_sig2,type="l",bty="n",
xlab=expression(sigma^2),ylab="log(L)",las=1,cex.axis=0.8)
grid()
clip(0,4,min(finite(lnL_sig2)),max(lnL_sig2))
abline(v=bounded_fit$sigsq,lty="dotted")
text(x=bounded_fit$sigsq,y=(logLik(bounded_fit)+min(finite(lnL_sig2)))/2,
expression(paste("MLE ",sigma^2)),srt=90,pos=4,cex=0.8)
x0<-seq(-1+1e-6,1-1e-6,length.out=nn)
lnL_x0<-future_sapply(x0,bounded_fit$lik,sig2=bounded_fit$sigsq,
parallel=FALSE,future.seed=TRUE)
plot(x0,lnL_x0,type="l",bty="n",
xlab=expression(x[0]),ylab="log(L)",las=1,cex.axis=0.8)
grid()
clip(-1,1,min(lnL_x0),max(lnL_x0))
abline(v=bounded_fit$x0,lty="dotted")
text(x=bounded_fit$x0,y=(logLik(bounded_fit)+min(finite(lnL_x0)))/2,
expression(paste("MLE ",x[0])),srt=90,pos=4,cex=0.8)
plan(sequential)
It may not be overly apparent from these graphs – but if we look at the vertical axis of our plot, we should see that the scale is much narrower for our \(x_0\) than \(\sigma^2\), indicating much greater curvature (i.e., information about the parameters) in the latter than the former.
This will be borne out if we compute the matrix of second derivatives of our likelihood surface around the MLE solution, which we can do using numDeriv::hessian
as follows.
foo<-function(par) bounded_fit$lik(par[1],par[2],parallel=FALSE)
H<-numDeriv::hessian(foo,c(bounded_fit$sigsq,bounded_fit$x0))
H
## [,1] [,2]
## [1,] -8.4466111 -0.5417145
## [2,] -0.5417145 -1.4801700
Larger negative values indicate steeper curvature around the MLE, which we see here for \(\sigma^2\) compared to \(x_0\). (Or, conversely, the negative inverse of H
will – according to the theory of likelihoods – asymptotically approximate the variance-covariance matrix of our parameters.)
solve(-H)
## [,1] [,2]
## [1,] 0.12123631 -0.04437022
## [2,] -0.04437022 0.69183672
This is all I’m saying about that for now!
No comments:
Post a Comment
Note: due to the very large amount of spam, all comments are now automatically submitted for moderation.