For some time, I’ve been fascinated by the usefulness of the discretized diffusion approximation method of Boucher & Démery (2016) to obtain likelihood functions for specifically non-Gaussian evolutionary models for which no closed-form solution for the probability density under the model is available.
I can’t get into all of it here, but for those who are interested in getting some more details, I suggest checking out this blog post or my Evolution 2025 meeting talk on YouTube, along with numerous other posts to this blog (you can just search for "Boucher").
Prior uses of this approximation on my blog (and now in phytools) including fitting a multi-state threshold model, fitting discrete character & hidden-state dependent heterogeneous rate continuous character evolution models, and, of course, bounded Brownian motion.
Towards the end of last year, I posted on visualizing evolution under what I referred to as a “semi-threshold” trait evolution model.
Just as a reminder, the threshold model is a discrete character evolution model in which the observed trait is discretely-valued but is underlain by an unobserved continuous character called “liability.” Every time liability crosses some fixed threshold value, our discrete character changes in state.
A semi-threshold model, on the other hand, imagines that on some part of its range liability can be observed as a measurable quantitative trait. On another part of its range, by contrast, liability is not observable but nonetheless continues to evolve.
A simple example of a trait that could conceivably evolve in this manner might be “horn length” which can increase or decrease – but when it shrinks to zero the horn is absent and can no longer be measured. The ability to produce a horn (our “liability” trait, in this case), however, might continue to evolve even if the horn is no longer present and measurable. (That is, liability could continue to evolve away from simple absence, and thus make horns increasingly difficult to re-evolve.)
To get a sense of evolution under our “semi-threshold” model I created the visual simulation below (adapted from my previous post).
## load packages
library(phytools)
## set number of taxa
N<-80
## simulate a tree
tree<-pbtree(n=N,scale=100)
tree<-untangle(ladderize(tree),"read.tree")
## add a lot of non-branching nodes along each edge
tt<-map.to.singleton(make.era.map(tree,seq(0,100,by=1)))
## simulate liability (x)
x<-fastBM(tt,sig2=1.25,internal=TRUE,a=5)
## map the presence/absence of x as >0 or <0
maps<-list()
for(i in 1:nrow(tt$edge)){
if(x[tt$edge[i,1]]<0 && x[tt$edge[i,2]]<0)
maps[[i]]<-setNames(tt$edge.length[i],"0")
else if(x[tt$edge[i,1]]>0 && x[tt$edge[i,2]]>0)
maps[[i]]<-setNames(tt$edge.length[i],"1")
else {
d<-x[tt$edge[i,2]]-x[tt$edge[i,1]]
if(d>0)
maps[[i]]<-setNames(abs(c(-x[tt$edge[i,1]]/d,
x[tt$edge[i,2]]/d))*tt$edge.length[i],
0:1)
else
maps[[i]]<-setNames(abs(c(x[tt$edge[i,1]]/d,
x[tt$edge[i,2]]/d))*tt$edge.length[i],
1:0)
}
}
tt$maps<-maps
class(tt)<-c("simmap","phylo")
## create phenogram plot
par(mar=c(5.1,5.1,1.1,1.1),mfrow=c(2,1))
tmp<-phenogram(tt,x,
lty=setNames(c("solid","solid"),0:1),
colors=setNames(c("black","blue"),0:1),
lty=setNames(c("dotted","solid"),0:1),
xlim=c(-5,110),ylim=rep(max(abs(x)),2)*c(-1,1),
ylab="liability / phenotype",las=1,cex.axis=0.9,
ftype="off")
lines(c(0,100),c(0,0))
## label the threshold
text(0,-0.06*max(abs(x)),"threshold",cex=0.8,pos=4,font=3)
## label the trait ranges
text(x=-7,y=mean(c(0,max(abs(x)))),"trait present",srt=90,
cex=0.8,font=3)
lines(c(-3,-5,-5,-3),c(0.5,0.5,max(abs(x)),max(abs(x))))
text(x=-7,y=mean(c(0,-max(abs(x)))),"trait absent",srt=90,
cex=0.8,font=3)
lines(c(-3,-5,-5,-3),c(-0.5,-0.5,-max(abs(x)),-max(abs(x))))
mtext("a)",line=0,adj=-0.1)
## create contMap plot
obj<-contMap(tt,x[1:Ntip(tt)],anc.states=x[1:tt$Nnode+Ntip(tt)],
lims=rep(max(abs(x)),2)*c(-1,1),plot=FALSE,method="user")
## re-color so that any value of trait < 0 is grey
obj<-setMap(obj,c(rep("black",500),
colorRampPalette(c("white","blue"))(500)))
## plot contMap
plot(obj,mar=c(1.1,5.1,0.1,1.1),legend=FALSE,
xlim=c(-5,110),ftype="off",ylim=c(-0.2*N,N),
lwd=3,outline=FALSE)
segments(x0=rep(100,Ntip(tt))+1,x1=tmp[,"x"],
y0=1:Ntip(obj$tree),y1=1:Ntip(obj$tree),lty="dotted")
text(obj$tree$tip.label,x=tmp[,"x"],y=1:Ntip(obj$tree),
pos=4,cex=0.6)
## add legend
add.color.bar(60,cols=colorRampPalette(c("white","blue"))(100),
prompt=FALSE,lims=c(0,max(abs(x))),x=0,y=-0.1*N,
title="quantitative trait value (if present)",subtitle="")
lines(x=c(0,10),y=rep(-0.2*N,2),lwd=6,lend=4)
lines(x=c(0,10),y=rep(-0.2*N,2),lwd=4,lend=4,col="black")
text(x=11,y=-0.2*N,"trait absent",pos=4)
mtext("b)",line=0,adj=-0.1)
This visualization shows the evolution of normally unobserved liability (when below the threshold) and our observed continuous trait (when above it), in panel (a); and evolution of our semi-threshold trait mapped onto the branches of the tree, in panel (b). Once again, we could imagine that our trait is the horn – which can have a length if present, but for which we imagine that the underlying continuous basis may continue to change even if the trait is no longer produced!
For good measure, here’s how our semi-threshold trait is encoded in this case. Here, all values below zero (in this case) for the liability character are just set equal to zero, and our trait has been “semi-thresholded.”
## semi-threshold trait
y<-x[tree$tip.label]
y[y<0]<-0
print(y,digits=2)
## t63 t64 t38 t24 t34 t35 t10 t52 t53 t18 t8 t58 t59 t45
## 10.49 10.88 13.02 1.52 18.95 8.16 0.69 19.47 13.86 22.08 15.20 6.76 7.77 0.00
## t11 t6 t65 t66 t47 t39 t77 t78 t37 t79 t80 t36 t32 t31
## 4.09 9.78 11.79 6.75 7.49 10.75 10.88 11.05 6.85 0.00 0.00 0.00 0.00 0.00
## t5 t42 t43 t71 t72 t15 t16 t27 t28 t23 t12 t2 t3 t4
## 0.00 2.73 0.00 0.00 0.00 0.00 0.00 5.38 15.72 3.08 11.26 22.19 12.90 19.65
## t56 t57 t41 t1 t69 t70 t60 t33 t48 t49 t19 t20 t29 t30
## 0.00 0.00 0.00 9.80 11.10 9.20 4.42 0.00 0.00 1.25 0.00 0.14 1.42 0.00
## t73 t74 t40 t22 t75 t76 t13 t14 t50 t51 t46 t17 t67 t68
## 10.39 9.10 3.93 13.86 16.66 18.08 23.18 25.47 3.84 9.87 1.88 8.05 22.46 21.85
## t21 t7 t61 t62 t44 t25 t26 t54 t55 t9
## 16.65 19.59 17.60 21.46 19.02 4.07 2.45 10.90 10.71 0.00
Today, I implemented a prototype fitSemiThresh
function to fit the semi-threshold model to data using the discretized diffusion approximation to compute the likelihood, as follows.
fitSemiThresh<-function(tree,x,threshold=c(0,1),...){
if(hasArg(trace)) trace<-list(...)$trace
else trace<-0
if(hasArg(levs)) levs<-list(...)$levs
else levs<-200
if(hasArg(root)) root=list(...)$root
else root<-"mle"
if(root=="nuisance") pi<-"fitzjohn"
else if(root=="mle") pi<-"mle"
else if(root=="flat") pi<-rep(1/levs,levs)
if(hasArg(lik.func)) lik.func<-list(...)$lik.func
else lik.func<-"pruning"
if(hasArg(expm.method)) expm.method<-list(...)$expm.method
else expm.method<-"R_Eigen"
if(hasArg(p)) p<-list(...)$p
else p<-4
C<-vcv(tree)
N<-Ntip(tree)
## continuous character
x<-x[tree$tip.label]
if(is.null(threshold)){
threshold<-range(x)
LIMS<-phytools:::expand.range(threshold,p=p)
} else if(threshold[1]==-Inf&&threshold[2]==Inf){
LIMS<-phytools:::expand.range(range(x),p=p)
threshold<-LIMS
} else if(threshold[2]==Inf){
LIMS<-phytools:::expand.range(c(threshold[1],max(x)),p=p)
threshold[2]<-LIMS[2]
} else if(threshold[1]==-Inf){
LIMS<-phytools:::expand.range(c(min[x],threshold[2]),p=p)
threshold[1]<-LIMS[1]
} else {
LIMS<-phytools:::expand.range(threshold,p=p)
}
dd<-diff(LIMS)
tol<-1e-8*dd/levs
n2<-round(diff(threshold)/diff(LIMS)*levs)+1
bb<-seq(threshold[1]+tol,threshold[2]-tol,length.out=n2)
w<-diff(range(bb))/(n2-1)
n1<-if(LIMS[1]<threshold[1])
round((threshold[1]-LIMS[1])/diff(LIMS)*levs) else 0
n3<-levs-n2-n1+1
bb<-c(seq(to=threshold[1]-w,by=w,length.out=n1),
bb,
seq(from=threshold[2]+w,by=w,length.out=n3))
bins<-cbind(bb[1:levs],bb[2:(levs+1)])
X<-phytools:::to_binned(x,bins)
ll<-which(bins[,1]<=threshold[1])
uu<-which(bins[,2]>=threshold[2])
for(i in 1:length(x)){
if(x[i]<=threshold[1]) X[i,ll]<-1
else if(x[i]>=threshold[2]) X[i,uu]<-1
}
MODEL<-matrix(0,ncol(X),ncol(X),
dimnames=list(colnames(X),colnames(X)))
for(i in 1:(nrow(MODEL)-1)) MODEL[i,i+1]<-MODEL[i+1,i]<-1
pic_x<-pic(x,multi2di(tree))
nn<-max(c(floor(0.2*multi2di(tree)$Nnode),1))
q.init<-runif(n=1,0,2)*
(1/2)*mean(sort(pic_x^2,decreasing=TRUE)[1:nn])*
(levs/dd)^2
if(hasArg(max.sigsq)) max.sigsq<-list(...)$max.sigsq
else max.sigsq<-2*mean(sort(pic_x^2,decreasing=TRUE)[1:nn])
max.q=(1/2)*max.sigsq*(levs/dd)^2
mk_fit<-fitMk(tree,X,model=MODEL,pi=pi,lik.func=lik.func,
expm.method=expm.method,logscale=TRUE,max.q=max.q,
q.init=q.init)
lik<-logLik(mk_fit)-Ntip(tree)*log(dd/levs)
x0<-sum(mk_fit$pi*rowMeans(bins))
sigsq<-2*mk_fit$rates*(dd/levs)^2
object<-list(
sigsq=sigsq,
x0=x0,
threshold=threshold,
ncat=levs,
logLik=lik,
opt_results=mk_fit$opt_results,
at_bounds=(sigsq>=(max.sigsq-tol)),
mk_fit=mk_fit)
class(object)<-"fitSemiThresh"
object
}
print.fitSemiThresh<-function(x,digits=6,...){
cat(paste("Object of class \"fitSemiThresh\" based on\n",
" a discretization with k =",
x$ncat,"levels.\n"))
cat(paste("Set or estimated thresholds: [",round(x$threshold[1],digits),
",",round(x$threshold[2],digits),"]\n\n"))
cat("Fitted model parameters:\n")
if(is.null(x$CI)) cat(paste(" sigsq:",round(x$sigsq,6),"\n"))
else cat(paste(" sigsq:",round(x$sigsq,6)," [",round(x$CI[1],4),
",",round(x$CI[2],4),"]\n"))
cat(paste(" x0:",round(x$x0,6),"\n\n"))
cat(paste("Log-likelihood:",round(x$logLik,digits),"\n\n"))
if(x$opt_results$convergence == 0 && !x$at_bounds)
cat("R thinks it has found the ML solution.\n\n")
else if(x$at_bounds) cat("Optimization may be at bounds.\n\n")
else cat("R thinks optimization may not have converged.\n\n")
}
Let’s try it!
fitted_model<-fitSemiThresh(tree,y,threshold=c(0,Inf))
fitted_model
## Object of class "fitSemiThresh" based on
## a discretization with k = 200 levels.
## Set or estimated thresholds: [ 0 , 63.683218 ]
##
## Fitted model parameters:
## sigsq: 1.175643
## x0: 7.387253
##
## Log-likelihood: -194.625829
##
## R thinks it has found the ML solution.
Cool. These parameter estimates actually come out very close to our generating values of \(\sigma^2 = 1.25\) and \(x_0 = 5\).
We can contrast that with what we might obtain using geiger::fitContinuous
under an assumption of simple, unbounded & un-thresholded Brownian motion. Ignoring the threshold, we’ll tend to underestimate \(\sigma^2\) compared to its generating value.
geiger::fitContinuous(tree,y)
## GEIGER-fitted comparative model of continuous data
## fitted 'BM' model parameters:
## sigsq = 0.777791
## z0 = 8.761141
##
## model summary:
## log-likelihood = -241.213382
## AIC = 486.426764
## AICc = 486.582608
## free parameters = 2
##
## Convergence diagnostics:
## optimization iterations = 100
## failed iterations = 0
## number of iterations with same best fit = 100
## frequency of best fit = 1.000
##
## object summary:
## 'lik' -- likelihood function
## 'bnd' -- bounds for likelihood search
## 'res' -- optimization iteration summary
## 'opt' -- maximum likelihood parameter estimates
It gets even worse when there are thresholds at the upper and lower boundaries of the traits.
Let’s create a trait this property using \(\sigma^2 = 0.3\), \(x_0 = 0.5\), and thresholds at \([0,1]\)
N<-200
tree<-pbtree(n=N,scale=1)
x<-fastBM(tree,a=0.5,sig2=0.3)
x[x<0]<-0
x[x>1]<-1
hist(x,
main="distribution of semi-thresholded trait values",
font.main=3,
xlab="trait",breaks=seq(-0.025,1.025,by=0.05))
fitted_model<-fitSemiThresh(tree,x,threshold=c(0,1),
levs=200)
fitted_model
## Object of class "fitSemiThresh" based on
## a discretization with k = 200 levels.
## Set or estimated thresholds: [ 0 , 1 ]
##
## Fitted model parameters:
## sigsq: 0.285624
## x0: 0.61
##
## Log-likelihood: 117.02295
##
## R thinks it has found the ML solution.
Once again, we can contrast this with geiger::fitContinuous
treating the character as if it was evolving without bounds.
geiger::fitContinuous(tree,x)
## GEIGER-fitted comparative model of continuous data
## fitted 'BM' model parameters:
## sigsq = 0.210468
## z0 = 0.596543
##
## model summary:
## log-likelihood = 32.199314
## AIC = -60.398629
## AICc = -60.337715
## free parameters = 2
##
## Convergence diagnostics:
## optimization iterations = 100
## failed iterations = 0
## number of iterations with same best fit = 100
## frequency of best fit = 1.000
##
## object summary:
## 'lik' -- likelihood function
## 'bnd' -- bounds for likelihood search
## 'res' -- optimization iteration summary
## 'opt' -- maximum likelihood parameter estimates
It’s an important point to make that this model has standard Brownian motion as a special case but in which the “thresholds” are set to \([-\infty, \infty]\).
Let me show that quickly here whilst increasing levs
to levs=400
….
x<-fastBM(tree<-pbtree(n=100))
fitSemiThresh(tree,x,threshold=c(-Inf,Inf),levs=400)
## Object of class "fitSemiThresh" based on
## a discretization with k = 400 levels.
## Set or estimated thresholds: [ -18.135385 , 20.253821 ]
##
## Fitted model parameters:
## sigsq: 0.790013
## x0: 1.58707
##
## Log-likelihood: -121.809603
##
## R thinks it has found the ML solution.
geiger::fitContinuous(tree,x)
## GEIGER-fitted comparative model of continuous data
## fitted 'BM' model parameters:
## sigsq = 0.780540
## z0 = 1.564920
##
## model summary:
## log-likelihood = -121.335715
## AIC = 246.671429
## AICc = 246.795141
## free parameters = 2
##
## Convergence diagnostics:
## optimization iterations = 100
## failed iterations = 0
## number of iterations with same best fit = 100
## frequency of best fit = 1.000
##
## object summary:
## 'lik' -- likelihood function
## 'bnd' -- bounds for likelihood search
## 'res' -- optimization iteration summary
## 'opt' -- maximum likelihood parameter estimates
As always under the discretized diffusion approximation, this similarity should go up for increasing values of levs
, but at what can be very significant computational cost!
Finally, just as with bounded_bm
, fitThresh
will always tend to explain our data better than unbounded Brownian motion for thresholds (or bounds) set to the observable limits of our trait.
That’s because the effect is to “fold” the tail of the probability density back in at the edges, making our most extreme observations more probable.
As such, this model should be fit to data when we have genuine reason to suspect that our trait is evolving under a semi-threshold property – either on logical grounds, or based on the distribution of observed values among our tip species. In the latter case it could then be competed against alternative models with similar among-species trait value distribution, such as the reflective bounded model or a continuous character model with sticky bounds (not yet implemented, but not hard to imagine).
Look for more on this soon!