Monday, June 24, 2024

Parameter estimation of discrete state-dependent multi-rate continuous character evolution using the discrete approximation of Boucher & Démery (2016): A comparative analysis

As of just last week there is a new version of phytools (2.3-0) on CRAN with lots of new additions, features, & functions. Please check it out!

Dirk Eddelbuettel also provides a compact summary of the phytools updates since the last CRAN version (phytools 2.1-1) on his CRANberries aggregator (also screen-shotted below).

As I’ve been writing about recently (1, 2, 3, 4, 5), one of the new functions in phytools 2.3-0 is fitmultiBM which fits a discrete character state dependent multi-rate Brownian motion model using the discrete approximation of Boucher & Démery (2016). (Actually, to follow along in this post, I’d recommend updating phytools from GitHub as I’ve already posted a few small improvements to fitmultiBM there.)

What I’d like to do today is examine parameter estimation under the model – and perhaps compare it to parameter estimation from existing workflows to model multi-rate discrete character dependent continuous trait evolution.

Let’s begin by loading phytools.

library(phytools)
packageVersion("phytools")
## [1] '2.3.1'

Now I’m going to undertake a very limited simulation study here – mostly because computation is still quite slow for this function.

First, I’ll simulate forty 200-taxon pure-birth trees using pbtree with a total tree length of 1.0.

Next, I’ll generate a discrete character history on each of them using sim.history. I’ll simulate equal back-and-forth rates between two character levels ("a" and "b") with a transition rate, \(q\), drawn from a uniform distribution on \([1,10]\). Note that these are quite high transition rates for a tree of length 1.0. This was done on purpose, for reasons I’ll discuss below.

Finally, I’ll simulate a continuous trait evolving with a Brownian rate that depends on the discrete character: with \(\sigma_{a}^2\) drawn randomly from a uniform distribution on \([0.5,2]\) and \(\sigma_{b}^2\) drawn randomly on \([5,20]\).

## set simulation conditions
nsim<-40
N<-200
Q<-matrix(c(-1,1,1,-1),2,2,
  dimnames=list(c("a","b"),c("a","b")))
q<-runif(n=nsim,1,10)
sig2<-replicate(nsim,
  setNames(c(runif(n=1,0.5,2),runif(n=1,5,20)),
    c("a","b")),simplify=FALSE)
## simulate trees and data
trees<-pbtree(n=N,nsim=nsim,scale=1)
tt<-mapply(sim.history,tree=trees,
  Q=lapply(q,function(q,Q) q*Q,Q=Q),
  MoreArgs=list(message=FALSE),SIMPLIFY=FALSE)
y<-lapply(tt,getStates,type="tips")
x<-mapply(sim.rates,tree=tt,sig2=sig2,
  SIMPLIFY=FALSE)

We can now fit our models.

In the code below, I’m using phytools::fitmultiBM to fit our discrete character state dependent quantitative trait evolution model. I set parallel=TRUE so that internally-used functions parallelize matrix exponentiation, which helps alot with the computation. If you re-run my code, you’ll find that it’s printing out the log-likelihood from each analysis. I’m also using try to make sure that one optimization failure doesn’t kill the whole study!

I’m pretty confident I got convergence to the ML solution for each dataset; however, in practical empirical cases I’d recommend running multiple optimization iterations & would be happy to illustrate that in another post.

## create list to populate
fits_mbm<-list()
## iterate over simulated data
for(i in 1:nsim){
  cat(paste("dataset #",i,": \n",sep=""))
  ## repeat if non-convergent or fails
  fits_mbm[[i]]<-list(opt_results=list(convergence=1))
  while(fits_mbm[[i]]$opt_results$convergence!=0){
    class(fits_mbm[[i]])<-"try-error"
    while(inherits(fits_mbm[[i]],"try-error")){
      fits_mbm[[i]]<-try(fitmultiBM(tree=trees[[i]],
        x=x[[i]],y=y[[i]],parallel=TRUE))
    }
    cat(paste("\tlog(L) = ",
      paste(round(logLik(fits_mbm[[i]]),3),
      collapse=" "),"\n",sep=""))
  }
}

OK, so let’s take a look at our results.

Firstly, in a prior post on this blog I predicted that when the rate of our continuous trait did indeed vary as a function of our discrete character, then fitting a model of their joint evolution should result in more accurate estimation of the transition rates between character levels for our discrete trait. This is because the continuous character (in that case) should contain information about the evolutionary process for our discrete trait. By taking this information into account when we jointly estimate \(q\), our estimate should become more accurate!

Let’s see if that’s true by graphing our estimates of \(q\) from fitmultiBM as well as values obtained conventionally using phytools::fitMk.

par(mfrow=c(1,2),mar=c(5.1,5.1,2.1,2.1))
est_q<-sapply(fits_mbm,function(x) x$rates)
plot(q,est_q,log="xy",xlim=c(1,20),ylim=c(1,20),
  xlab="true value of q",
  ylab="fitmultiBM estimate of q",pch=21,bg="grey",
  cex=1.5,las=1,cex.axis=0.8)
abline(a=0,b=1,lty="dotted")
mk_q<-mapply(function(t,x) fitMk(t,x)$rates,
  t=trees,x=y)
plot(q,mk_q,log="xy",xlim=c(1,20),ylim=c(1,20),
  xlab="true value of q",
  ylab="fitMk estimate of q",pch=21,bg="grey",
  cex=1.5,las=1,cex.axis=0.8)
abline(a=0,b=1,lty="dotted")

plot of chunk unnamed-chunk-6

These two panels look quite similar, though I see slightly more scatter in the right compared to the left panel. Let’s see if I’m correct.

## estimates from fitmultiBM
cor(q,est_q)
## [1] 0.8032182
## estimates from fitMk
cor(q,mk_q)
## [1] 0.7641394

Indeed, this shows that we (slightly) more accurately estimated the transition rate between discrete character levels when we jointly estimated it with our multi-rate Brownian evolution model. This needs more investigation, but is pretty convincing at first glance.

Next, moving to the more important question, let’s see how we did in estimating our \(\sigma^2\) values themselves.

Here, I’ll just plot the estimates of \(\sigma_a^2\) and \(\sigma_b^2\) in separate panels.

par(mfrow=c(1,2),mar=c(5.1,5.1,2.1,2.1))
sig2_est<-t(sapply(fits_mbm,function(x) x$sigsq))
sig2_gen<-t(sapply(sig2,function(x) x))
plot(sig2_gen[,1],sig2_est[,1],log="xy",
  xlim=range(c(sig2_gen[,1],sig2_est[,1])),
  ylim=range(c(sig2_gen[,1],sig2_est[,1])),
  xlab=expression(paste("true value of ",
    sigma[a]^2)),
  ylab=expression(paste("fitmultiBM estimate of ",
    sigma[a]^2)),pch=21,bg="grey",cex=1.5,las=1,
  cex.axis=0.8)
abline(a=0,b=1,lty="dotted")
plot(sig2_gen[,2],sig2_est[,2],log="xy",
  xlim=range(c(sig2_gen[,2],sig2_est[,2])),
  ylim=range(c(sig2_gen[,2],sig2_est[,2])),
  xlab=expression(paste("true value of ",
    sigma[b]^2)),
  ylab=expression(paste("fitmultiBM estimate of ",
    sigma[b]^2)),pch=21,bg="grey",cex=1.5,las=1,
  cex.axis=0.8)
abline(a=0,b=1,lty="dotted")

plot of chunk unnamed-chunk-9

This is hard to assess, but it would seem relatively unbiased. We would need more than 40 simulations to check for sure!

What’s going to be much more interesting, however, is to compare this result to a standard workflow for analyzing multi-rate Brownian evolution in a continuous character that we hypothesize varies as a function of a continuous trait.

In a recent post I characterized this analytical workflow as follows: “The standard way to investigate this type of hypothesis in phylogenetic comparative biology has been to first map the discrete character history on the tree (using, for example, stochastic character mapping), and then fit a regime-dependent model following O’Meara et al. (2006).” We would then average across our set of maps to obtain parameter estimates of \(\sigma_a^2\) and \(\sigma_b^2\).

I described how this can result in a biased underestimate of the difference in rate between regimes in a 2013 paper published in Systematic Biology.

In the code chunk below, I apply this workflow using fitMk, simmap, and brownie.lite in phytools. (For anyone following along, you might considering parallelizing your brownie.lite optimizations using the foreach package. I wish I had!)

mk_fits<-mapply(fitMk,tree=trees,x=y,SIMPLIFY=FALSE)
smaps<-lapply(mk_fits,simmap)
bm_sig2<-matrix(NA,nrow=nsim,ncol=2,
  dimnames=list(1:nsim,c("a","b")))
for(i in 1:nsim){
  ss_fits<-lapply(smaps[[i]],brownie.lite,x=x[[i]])
  bm_sig2[i,]<-rowMeans(sapply(ss_fits,
    function(x) x$sig2.multiple[c("a","b")]))
}

Finally, let’s graph our results.

This time, and since we already noted (above) that \(\sigma_a^2\) and \(\sigma_b^2\) were estimated more or less unbiasedly by fitmultiBM, why don’t we focus on the ratio of and \(\sigma_b^2/\sigma_a^2\)? Indeed, it’s the difference in rate between discrete character levels that Revell (2013) tells us should be underestimated in our standard fitMk \(\rightarrow\) simmap \(\rightarrow\) brownie.lite workflow.

par(mfrow=c(1,2),mar=c(5.1,5.1,3.1,2.1))
est_sigsq<-t(sapply(fits_mbm,function(x) x$sigsq))
plot(sig2_gen[,2]/sig2_gen[,1],
  est_sigsq[,2]/est_sigsq[,1],
  log="xy",xlim=c(1,30),ylim=c(1,30),
  xlab=expression(paste("generating ",
    sigma[b]^2/sigma[a]^2)),
  ylab=expression(paste("fitmultiBM estimated ",
    sigma[b]^2/sigma[a]^2)),pch=21,bg="grey",
  cex=1.5,las=1,cex.axis=0.8)
abline(a=0,b=1,lty="dotted")
mtext("a) fitmultiBM analysis",line=1,adj=-0.2)
plot(sig2_gen[,2]/sig2_gen[,1],bm_sig2[,2]/bm_sig2[,1],
  log="xy",xlim=c(1,30),ylim=c(1,30),
  xlab=expression(paste("generating ",
    sigma[b]^2/sigma[a]^2)),
  ylab=expression(paste("brownie.lite estimated ",
    sigma[b]^2/sigma[a]^2)),pch=21,bg="grey",
  cex=1.5,las=1,cex.axis=0.8)
abline(a=0,b=1,lty="dotted")
mtext("b) standard workflow",line=1,adj=-0.2)

plot of chunk unnamed-chunk-12

Cool!

This shows us just how badly separate (rather than joint) estimation of the discrete and continuous character evolution causes us to underestimate the true rate heterogeneity due to our discrete trait.

Finally, I earlier noted that the selection of a high rate of evolution of the discrete character, \(q\), had been critical and that I would circle back to why that was the case.

This was because if our discrete character changes rarely then, most likely, all of our stochastic maps will capture something very close to the real evolutionary history of our discrete trait and the traditional workflow will be unbiased for \(\sigma_b^2/\sigma_a^2\). (Indeed, this is exactly what I found in Revell 2013). Figure 2 from that article shows precisely this pattern.

More on all this stuff later, of course, but that’s all for now folks!

No comments:

Post a Comment

Note: due to the very large amount of spam, all comments are now automatically submitted for moderation.