Friday, June 5, 2026

Ancestral states under the discretized diffusion approximation using bounded_bm in phytools

As many followers & readers of this blog will undoubtedly know, I’ve been thinking & writing endlessly about what we refer to as the “discretized diffusion approximation” for computing the likelihood in continuous & discrete character models (first instroduced to phylogenetic comparative biology by Boucher & Démery 2016) for well over two years. We recently published a pre-print that I encourage anyone interested in this topic to check out!

In this post, I just wanted to briefly explore ancestral state estimation using this approach which I’ve implemented in the phytools ancr generic method.

First, let’s just see how it works in general. To do that, I’m going to simulate a tree & some data via unbounded Brownian motion under which our MLEs for ancestral states would normally be very easy to obtain (e.g., using phytools::fastAnc).

## load phytools
library(phytools)
## simulate a tree
phy<-pbtree(n=26,scale=1,tip.label=LETTERS)
phy
## 
## Phylogenetic tree with 26 tips and 25 internal nodes.
## 
## Tip labels:
##   A, B, C, D, E, F, ...
## 
## Rooted; includes branch length(s).
## simulate some trait data
x<-fastBM(phy)
x
##           A           B           C           D           E           F           G 
## -0.52238323 -1.08476410 -0.73893741 -0.75336367 -0.94397490 -0.67163255 -1.21387274 
##           H           I           J           K           L           M           N 
## -0.65653756 -1.16240551 -0.87838336 -0.93716934 -0.80657269 -0.96539003 -1.09064863 
##           O           P           Q           R           S           T           U 
## -1.80940444 -0.99874761 -2.07552774 -1.83396425 -0.77599997 -2.04509690 -0.67626465 
##           V           W           X           Y           Z 
## -0.09662133  0.57076545  0.47261265  0.26072599 -0.03490109

Next, let’s fit an unbounded Brownian model using the discrete approximation. This can be done in phytools with the function bounded_bm just by allowing the limits or bounds of the trait to fall well outside the observed trait distribution. (This is technically a “bounded” model, but since almost no probability density hits the bounds it is effectively unbounded BM.)

## fit unbounded model
bm<-bounded_bm(phy,x,root="mle",levs=400)
bm
## Object of class "bounded_bm" based on
##    a discretization with k = 400 levels.
## 
## Unwrapped (i.e., bounded) model
## 
## Set or estimated bounds: [ -3.398674 , 1.893912 ]
## 
## Fitted model parameters:
##   sigsq: 0.639077 
##      x0: -0.69284 
## 
## Log-likelihood: -17.871792 
## 
## R thinks it has found the ML solution.

If we compare this to the fit obtained using the continuous likelihood function (rather than our discretized approximation) we should find that the parameter estimates & log-likelihood (very, very nearly) match. This similar will only increase with levs (or equivalently, as \(\delta\), the bin width of our discretization, decreases towards zero).

geiger::fitContinuous(phy,x,model="BM")
## GEIGER-fitted comparative model of continuous data
##  fitted 'BM' model parameters:
## 	sigsq = 0.642594
## 	z0 = -0.692552
## 
##  model summary:
## 	log-likelihood = -17.940158
## 	AIC = 39.880316
## 	AICc = 40.402055
## 	free parameters = 2
## 
## Convergence diagnostics:
## 	optimization iterations = 100
## 	failed iterations = 0
## 	number of iterations with same best fit = 100
## 	frequency of best fit = 1.000
## 
##  object summary:
## 	'lik' -- likelihood function
## 	'bnd' -- bounds for likelihood search
## 	'res' -- optimization iteration summary
## 	'opt' -- maximum likelihood parameter estimates

This is old news.

What you might not have realized is that our model object from bounded_bm can be passed to the generic method ancr and we ought to get back out MLE ancestral states – with a small catch that I’ll get to!

This is what I mean….

## ancestral state estimates from the discrete approximation
anc.d<-ancr(bm)
print(anc.d,printlen=5)
## Ancestral character estimates from "bounded_bm" object:
##        27        28        29        30        31     
##  -0.69284 -0.909587 -0.809253 -0.858961 -0.854084 ....
## 
## Lower & upper 95% CIs:
##        lower     upper
## 27  -0.69284  -0.69284
## 28 -1.394107  -0.42821
## 29 -1.208867 -0.414979
## 30 -1.169172 -0.547293
## 31 -1.155941 -0.547293
##         ....      ....
## ancestral states using continuous likelihood function
anc.c<-fastAnc(phy,x,CI=TRUE)
print(anc.c,printlen=5)
## Ancestral character estimates using fastAnc:
##         27       28        29        30       31     
##  -0.692552 -0.90973 -0.810438 -0.860237 -0.85374 ....
## 
## Lower & upper 95% CIs:
##        lower     upper
## 27 -1.378038 -0.007065
## 28 -1.438079  -0.38138
## 29 -1.220804 -0.400071
## 30 -1.174338 -0.546135
## 31 -1.167611 -0.539869
##         ....      ....

Let’s plot one set of point estimates against the other.

par(mar=c(5.1,4.1,1.1,1.1))
plot(anc.c$ace,anc.d$ace,bty="n",pch=21,
  bg=make.transparent("blue",0.5),
  xlab="estimates from fastAnc",
  ylab="estimates from bounded_bm > ancr",
  las=1,cex.axis=0.8)
grid()
abline(a=0,b=1,lty="dashed")

plot of chunk unnamed-chunk-9

I don’t need any fancy statistics to see that I’m getting the right estimates here!

There is one teeny, tiny problem though.

It becomes evident when we closely inspect our model object from ancr as follows.

print(anc.d,prinlen=5)
## Ancestral character estimates from "bounded_bm" object:
##        27        28        29        30        31        32     
##  -0.69284 -0.909587 -0.809253 -0.858961 -0.854084 -0.884419 ....
## 
## Lower & upper 95% CIs:
##        lower     upper
## 27  -0.69284  -0.69284
## 28 -1.394107  -0.42821
## 29 -1.208867 -0.414979
## 30 -1.169172 -0.547293
## 31 -1.155941 -0.547293
## 32 -1.169172 -0.600219
##         ....      ....

Here we see that the upper & lower bounds of the confidence intervals on the global root state are both equal to the point estimate which is clearly wrong. Indeed, this is typically the most uncertain internal node state, rather than the least!

This happens because to compute the likelihood under our discretized diffusion approximation and have it match the continuous density, when we get to the root state we have to fix it to its MLE.

This has the effect of meaning that when we go ahead & undertake ASR using this fitted model, instead of just conditioning on the MLE of \(\sigma^2\), we’re also conditioning on the MLE of \(x_0\), the root state. This is not what we want to be doing at all!

Well, in this case it doesn’t really matter because we can solve the continuous density anyway; however, for other models where that’s intractable, what I might tentatively recommend instead is changing the root prior in estimation if we intend to use our fitted model in ASR.

Here’s what that would look like for our twenty-six taxon phylogeny of earlier.

bm_flatroot<-bounded_bm(phy,x,levs=400,root="flat")
bm_flatroot
## Object of class "bounded_bm" based on
##    a discretization with k = 400 levels.
## 
## Unwrapped (i.e., bounded) model
## 
## Set or estimated bounds: [ -3.398674 , 1.893912 ]
## 
## Fitted model parameters:
##   sigsq: 0.664673 
##      x0: -0.692566 
## 
## Log-likelihood: -19.682352 
## 
## R thinks it has found the ML solution.
anc.d_flatroot<-ancr(bm_flatroot)
print(anc.d_flatroot,printlen=5)
## Ancestral character estimates from "bounded_bm" object:
##         27        28        29        30        31     
##  -0.692566 -0.909521 -0.809241 -0.858956 -0.854082 ....
## 
## Lower & upper 95% CIs:
##        lower     upper
## 27 -1.380876 -0.004803
## 28 -1.433802 -0.388516
## 29 -1.222098 -0.401747
## 30 -1.169172 -0.547293
## 31 -1.169172 -0.547293
##         ....      ....

Let’s once again compare these states (that is, the point estimates at nodes) to the continuous likelihood function values.

par(mar=c(5.1,4.1,1.1,1.1))
plot(anc.c$ace,anc.d_flatroot$ace,bty="n",pch=21,
  bg=make.transparent("blue",0.5),
  xlab="estimates from fastAnc",
  ylab="estimates from bounded_bm > ancr",
  las=1,cex.axis=0.8)
grid()
abline(a=0,b=1,lty="dashed")

plot of chunk unnamed-chunk-13

Once again, it’s evident that we have the right ML states.

Now, for fun, let’s compare the lower & upper confidence bounds as well….

par(mfrow=c(1,2),mar=c(5.1,4.1,2.1,1.1))
plot(anc.c$CI95[,1],anc.d_flatroot$CI95[,1],bty="n",pch=21,
  bg=make.transparent("blue",0.5),
  xlab="lower CI bound from fastAnc",
  ylab="lower CI bound from bounded_bm > ancr",
  las=1,cex.axis=0.8)
grid()
abline(a=0,b=1,lty="dashed")
mtext("a) lower CI bound",adj=0)
plot(anc.c$CI95[,2],anc.d_flatroot$CI95[,2],bty="n",pch=21,
  bg=make.transparent("blue",0.5),
  xlab="upper CI bound from fastAnc",
  ylab="upper CI bound from bounded_bm > ancr",
  las=1,cex.axis=0.8)
grid()
abline(a=0,b=1,lty="dashed")
mtext("b) upper CI bound",adj=0)

plot of chunk unnamed-chunk-14 Dang. I didn’t expect that to work this well!

Nice.

Of course, this actually gets interesting when we start to introduce models (such as bounded BM) without tractable solutions for the likelihood. Perhaps I’ll demo that in another post!

No comments:

Post a Comment

Note: due to the very large amount of spam, all comments are now automatically submitted for moderation.