Saturday, July 5, 2025

A discrete character dependent mult-trended Brownian motion with a trend model using the discretized diffusion approximation

In two recent posts to this blog (1, 2) I have described using the discretized diffusion approximation of Boucher & Démery (2016) to fit a discrete trait dependent multi-trend trended Brownian motion model.

I’ve been playing around with this and it’s pretty good, actually.

To start, why don’t we simulate trended evolution on its own and show that absent a discrete character specific difference in trend between different parts of the tree, we obtain the same fit and likelihood from phytools::fitmultiTrend as we would have from geiger::fitContinuous. Remember, this model is only identifiable for non-ultrametric (e.g., tip-dated) trees.

library(phytools)
packageVersion("phytools")
## [1] '2.4.22'
library(geiger)
## simulate a tree
tree<-rtree(n=200)
## scale to 10 units long
tree$edge.length<-tree$edge.length/
  max(nodeHeights(tree))*10
## this is just for visualization purposes
simt<-map.to.singleton(make.era.map(tree,
  seq(0,max(nodeHeights(tree)),
    length.out=201)[1:200]))
## plot tree
par(mfrow=c(1,2))
plotTree(tree,ftype="off",mar=c(5.1,4.1,1.1,1.1),
  lwd=1)
axis(1,at=seq(0,10,by=2))
title(xlab="time")
## simulate trait
par(las=1)
x<-sim.multiTrend(simt,sig2=0.8,mu=-1.2,
  plot=TRUE)
legend("bottomleft",
  expression(paste(sigma^2," = 0.8; ",
    mu," = -1.2")),cex=0.8,
  bty="n")

plot of chunk unnamed-chunk-8

First, let’s use fitmultiTrend – even though, in this case, we have an analytic function for the likelihood.

fit_dda<-fitmultiTrend(tree,x,levs=200,
  parallel=TRUE)
fit_dda
## Object of class "fitmultiTrend" based on
##     a discretization with k = 200 levels.
## 
## Fitted multi-rate trend model parameters:
##  levels: [ 0 ]
##   sigsq: [ 0.758 ]
##   mu: [ -1.2399 ]
##      x0: -0.9729 
## 
## Estimated Q matrix:
##   0
## 0 0
## 
## Log-likelihood: -305.6603 
## 
## R thinks it has found the ML solution.

Let’s now compare this to what we would’ve obtained using geiger::fitContinuous.

fit_geiger<-fitContinuous(tree,x,model="mean_trend")
fit_geiger
## GEIGER-fitted comparative model of continuous data
##  fitted 'mean_trend' model parameters:
## 	sigsq = 0.747939
## 	drift = -1.235807
## 	z0 = -1.005460
## 
##  model summary:
## 	log-likelihood = -304.778169
## 	AIC = 615.556338
## 	AICc = 615.678787
## 	free parameters = 3
## 
## Convergence diagnostics:
## 	optimization iterations = 100
## 	failed iterations = 0
## 	number of iterations with same best fit = 58
## 	frequency of best fit = 0.580
## 
##  object summary:
## 	'lik' -- likelihood function
## 	'bnd' -- bounds for likelihood search
## 	'res' -- optimization iteration summary
## 	'opt' -- maximum likelihood parameter estimates

We should see that our parameter estimates & log-likelihood match quite closely – and they would become even more close for increased levs.

Next, using the same tree, we can simulate under the multi-trend model as follows.

## Q matrix for simulating discrete trait
Q<-matrix(c(-0.75,0.75,0.75,-0.75),2,2,
  dimnames=list(letters[1:2],letters[1:2]))
## simulated discrete trait history
simt<-sim.history(simt,Q,message=FALSE)
## pull off discrete factor
y<-as.factor(getStates(simt,"tips"))
## plot 
par(mfrow=c(1,2),las=1)
cols<-setNames(palette()[c(2,4)],letters[1:2])
plot(simt,colors=cols,ftype="off",
  mar=c(5.1,4.1,1.1,1.1))
axis(1,at=seq(0,10,by=2))
title(xlab="time")
sig2<-setNames(c(0.4,0.8),letters[1:2])
mu<-setNames(c(-1.0,1.2),letters[1:2])
x<-sim.multiTrend(simt,sig2=sig2,mu=mu,plot=TRUE,
  col=palette()[c(2,4)])
grid()
legend("bottomleft",
  c(expression(paste(sigma^2," = 0.4;  ",
    mu," = -1.0")),
    expression(paste(sigma^2," = 0.8;  ",
      mu," = 1.2"))),
  col=palette()[c(2,4)],lwd=2,cex=0.8,
  bty="n")

plot of chunk unnamed-chunk-11

Here, we have no other option than to fit this model using our discretized diffusion approximation, so all we can do is compare our estimates to their generating values. Remember, here we generated \(\sigma^2 = [0.4, 0.8]\), \(\mu = [-1.0, 1.2]\), and \(q = 0.75\)

fit_sdm<-fitmultiTrend(tree,x,y,parallel=TRUE,
  levs=100,ncores=10)
fit_sdm
## Object of class "fitmultiTrend" based on
##     a discretization with k = 100 levels.
## 
## Fitted multi-rate trend model parameters:
##  levels: [ a, b ]
##   sigsq: [ 0.2403, 0.8818 ]
##   mu: [ -0.9875, 1.1334 ]
##      x0: -0.8209 
## 
## Estimated Q matrix:
##            a          b
## a -0.6517332  0.6517332
## b  0.6517332 -0.6517332
## 
## Log-likelihood: -458.933 
## 
## R thinks it has found the ML solution.

That’s pretty cool, I guess.

I’ve also discovered although the multi-trend model remains non-identifiable on an ultrametric tree, the difference between trends becomes identifiable. More on this later.

No comments:

Post a Comment

Note: due to the very large amount of spam, all comments are now automatically submitted for moderation.