Sunday, June 30, 2024

Running multiple optimization iterations & other things for phytools::fitmultiBM

A couple of days ago a colleague asked me about running multiple optimization iterations of the new phytools function fitmultiBM, as well as some other stuff pertaining to how it can be used.

I thought I would create a quick demo of this and some other things. I’m still working on some of the details of this method and it’s implementation. Nonetheless, let’s go….

Readers following on should install phytools 2.3-2 from GitHub as I intend to use some features of fitmultiBM absent from the current CRAN version.

library(phytools)
packageVersion("phytools")
## [1] '2.3.2'

For this example, I’m going to use data for the totally real fish group called ‘grunters’ (Terapontidae). The phylogeny and trait data for this demo come from Davis et al. (2016) and can be downloaded from the website for our book.

grunter_tree<-read.tree(
  file="http://www.phytools.org/Rbook/11/terapontidae.phy")
grunter_tree
## 
## Phylogenetic tree with 38 tips and 37 internal nodes.
## 
## Tip labels:
##   Terapon_jarbua, Pelsartia_humeralis, Mesopristes_cancellatus, Mesopristes_argenteus, Rhyncopelates_oxyrhynchus, Terapon_puta, ...
## 
## Rooted; includes branch lengths.
grunter_data<-read.csv(
  file="http://www.phytools.org/Rbook/11/terapontidae.csv",
  row.names=1)
head(grunter_data)
##                         Diet     Animal      Plant        PC1         PC2
## Amniataba_affinis          2  0.2818512 -0.2818512  0.2913522 -0.21526981
## Amniataba_caudavittatus    1  2.5866893 -2.5866893 -0.6336029  0.15468468
## Amniataba_percoides        2 -0.2330491  0.2330491  0.1912102  0.07985197
## Bidyanus_bidyanus          3 -2.1972246  2.1972246  0.4823617 -0.78346644
## Bidyanus_welchi            2  0.8472979 -0.8472979  0.2891397 -0.77832681
## Hannia_greenwayi_1         2  1.1970524 -1.1970524 -0.1860913 -0.29919632

In our dataset I can see that our discrete character, Diet, is coded as a numeric variable. We could leave it like this, but I know from the study that state 1 is carnivore, 2 is omnivore, and 3 is herbivore, so let’s re-code our character to match these states.

diet_cat<-setNames(grunter_data$Diet,rownames(grunter_data))
diet<-setNames(
  as.factor(c("carnivore","omnivore","herbivore")[diet_cat]),
  names(diet_cat))

Here’s our discrete trait mapped onto the tips of the tree.

plotTree(grunter_tree,direction="upwards",fsize=0.8,
  ftype="i",offset=3)
tiplabels(pie=to.matrix(diet[grunter_tree$tip.label],
  levels(diet)),cex=0.5,offset=1.5,
  piecol=hcl.colors(n=3,palette="plasma")[c(2,1,3)])
legend("bottomright",levels(diet),pch=21,
  pt.bg=hcl.colors(n=3,palette="plasma")[c(2,1,3)],
  cex=0.8,pt.cex=2,inset=0.03,bty="n",horiz=TRUE)

plot of chunk unnamed-chunk-20

The continuous character we’re going to focus on for this analysis is PC1 in the data frame. This is a principle component from an analysis of dental morphology. (For more details, see Davis et al. 2016)

pc1<-setNames(grunter_data$PC1,rownames(grunter_data))
head(pc1)
##       Amniataba_affinis Amniataba_caudavittatus     Amniataba_percoides       Bidyanus_bidyanus 
##               0.2913522              -0.6336029               0.1912102               0.4823617 
##         Bidyanus_welchi      Hannia_greenwayi_1 
##               0.2891397              -0.1860913

Let’s proceed to fit our multi-rate model.

In this case, I’m going to begin by assuming that evolution is ordered such that carnivore \(\leftrightarrow\) omnivore \(\leftrightarrow\) herbivore, but in which the back-and-forth transition rates between adjacent levels are allowed to be asymmetric and different between each pair of states. (This is the thing that we couldn’t have done with the CRAN version of phytools!)

## design matrix for discrete character model
model<-matrix(c(0,0,1,0,0,2,3,4,0),3,3,
  dimnames=list(levels(diet),levels(diet)))
model
##           carnivore herbivore omnivore
## carnivore         0         0        3
## herbivore         0         0        4
## omnivore          1         2        0
pc1_multirate<-fitmultiBM(grunter_tree,pc1,diet,
  model=model,parallel=TRUE,levs=100,plot_model=TRUE)
## This is the design of your custom discrete-trait model:
##           carnivore herbivore omnivore
## carnivore         0         0        3
## herbivore         0         0        4
## omnivore          1         2        0

plot of chunk unnamed-chunk-23

(My apologies for the random palette. To my colorblind eye, the 'carnivore' and 'herbivore' continuous character rate shades look virtually identical, but I assure you they are different in the fitted model! My non-colorblind wife tells me they do look different to her eye.)

pc1_multirate
## Object of class "fitmultiBM" based on
##     a discretization with k = 100 levels.
## 
## Fitted multi-rate BM model parameters:
##  levels: [ carnivore, herbivore, omnivore ]
##   sigsq: [ 0.0035, 0.0293, 0.1598 ]
##      x0: -0.406 
## 
## Estimated Q matrix:
##             carnivore   herbivore    omnivore
## carnivore -0.03639316  0.00000000  0.03639316
## herbivore  0.00000000 -0.05789947  0.05789947
## omnivore   0.07757236  0.05445299 -0.13202536
## 
## Log-likelihood: -61.0404 
## 
## R thinks it has found the ML solution.

According to this fitted model, the condition omnivore has the highest \(\sigma^2\) rate of trait evolution for pc1, carnivore is the lowest, and herbivore is in between.

Now, fitmultiBM can be somewhat difficult to optimize – but does not run multiple optimization iterations by default. It does, on the other hand, use random initial values for optimization (unlike fitMk), so undertaking multiple optimization is as simple as writing a for loop! (One might think we could use the foreach package to parallelize multiple optimization iterations across cores as I have shown elsewhere on this blog; however, we cannot because we are parallelizing computation of the likelihood in each optimization iteration already!)

fits<-list()
niter<-10
for(i in 1:niter){
  capture.output(
    fits[[i]]<-fitmultiBM(grunter_tree,pc1,diet,
      model=model,parallel=TRUE,levs=100)
  )
}

(Calling capture.output simply has the effect of suppressing a print-out of our design matrix for every iteration of optimization! If we don’t care about that, we can simply remove it.)

Now let’s get the log-likelihoods for each fitted model.

lnL<-sapply(fits,logLik)
lnL
##  [1] -61.04035 -61.04035 -61.04035 -61.04035 -61.04035 -61.04035 -61.04035 -61.04035 -61.04035
## [10] -61.04035

Depending on our fortune, we may find that these are all the same – or that some are higher than others.

We can pull out the best-fitting model from the set as follows.

best_multirate<-fits[[which.max(lnL)[1]]]

(The simple reason I use which.max(...)[1] instead of which.max(...) is in case we’ve converged on the same solution with precisely the same log-likelihood in more than one optimization iteration. Indeed, we should hope that this is the case!)

best_multirate
## Object of class "fitmultiBM" based on
##     a discretization with k = 100 levels.
## 
## Fitted multi-rate BM model parameters:
##  levels: [ carnivore, herbivore, omnivore ]
##   sigsq: [ 0.0035, 0.0293, 0.1598 ]
##      x0: -0.406 
## 
## Estimated Q matrix:
##             carnivore   herbivore    omnivore
## carnivore -0.03639381  0.00000000  0.03639381
## herbivore  0.00000000 -0.05789971  0.05789971
## omnivore   0.07757370  0.05445375 -0.13202745
## 
## Log-likelihood: -61.0404 
## 
## R thinks it has found the ML solution.

For this experiment, most likely we’ve found that this fitted model is the same as the one we obtained earlier. Just in case it’s better let’s check & replace it if that’s the case.

pc1_multirate<-
  if(logLik(best_multirate)>logLik(pc1_multirate)) best_multirate else
    pc1_multirate

Quite often, and for more complex models and larger trees, it will take multiple optimization iterations to converge on the correct ML solution!

Lastly, at least for now, let’s compare this fitted model to our standard null model in which the rate of evolution of pc1 is the same for each of the three levels of our discrete trait.

pc1_nullmodel<-fitmultiBM(grunter_tree,pc1,diet,
  model=model,parallel=TRUE,levs=100,plot_model=TRUE,
  null_model=TRUE)
## This is the design of your custom discrete-trait model:
##           carnivore herbivore omnivore
## carnivore         0         0        3
## herbivore         0         0        4
## omnivore          1         2        0

plot of chunk unnamed-chunk-30

(The difference between this model structure and our multi-rate model is that the super- and sub-diagonals no longer differ by discrete state. Do you see it?)
pc1_nullmodel
## Object of class "fitmultiBM" based on
##     a discretization with k = 100 levels.
## 
## Fitted multi-rate BM model parameters:
##  levels: [ carnivore, herbivore, omnivore ]
##   sigsq: [ 0.0408, 0.0408, 0.0408 ]
##      x0: -0.344 
## 
## Estimated Q matrix:
##             carnivore    herbivore    omnivore
## carnivore -0.02005748  0.000000000  0.02005748
## herbivore  0.00000000 -0.067402676  0.06740268
## omnivore   0.10521478  0.008055272 -0.11327005
## 
## Log-likelihood: -67.9894 
## 
## R thinks it has found the ML solution.

We can compare these two models, one against the other as follows.

anova(pc1_nullmodel,pc1_multirate)
##                  log(L) d.f.      AIC      weight
## pc1_nullmodel -67.98944    6 147.9789 0.007039976
## pc1_multirate -61.04035    8 138.0807 0.992960024

This tells us that there’s a lot more support in these data for a scenario in which \(\sigma^2\) of our continuous trait varies as a function of our discrete trait than for one in which the rate is constant throughout the tree.

Note that this null model is the same and should (asympotically, as we increase levs) have the same likelihood, as a standard Mk and constant rate Brownian model for our discrete and continuous traits, respectively.

Let’s see that this is the case.

mk_fit<-fitMk(grunter_tree,diet,model=model,
  pi="mle")
mk_fit
## Object of class "fitMk".
## 
## Fitted (or set) value of Q:
##           carnivore herbivore  omnivore
## carnivore -0.020063  0.000000  0.020063
## herbivore  0.000000 -0.067403  0.067403
## omnivore   0.105219  0.008056 -0.113275
## 
## Fitted (or set) value of pi:
## carnivore herbivore  omnivore 
##         0         1         0 
## due to treating the root prior as (a) it's MLE.
## 
## Log-likelihood: -32.16097 
## 
## Optimization method used was "nlminb"
## 
## R thinks it has found the ML solution.
bm_fit<-geiger::fitContinuous(grunter_tree,pc1)
bm_fit
## GEIGER-fitted comparative model of continuous data
##  fitted 'BM' model parameters:
## 	sigsq = 0.044206
## 	z0 = -0.360878
## 
##  model summary:
## 	log-likelihood = -36.372527
## 	AIC = 76.745055
## 	AICc = 77.087912
## 	free parameters = 2
## 
## Convergence diagnostics:
## 	optimization iterations = 100
## 	failed iterations = 0
## 	number of iterations with same best fit = 100
## 	frequency of best fit = 1.000
## 
##  object summary:
## 	'lik' -- likelihood function
## 	'bnd' -- bounds for likelihood search
## 	'res' -- optimization iteration summary
## 	'opt' -- maximum likelihood parameter estimates

We should see that the parameter estimates are almost exactly the same as in our pc1_nullmodel object. Now let's check the likelihood.

## total likelihood
logLik(mk_fit)+logLik(bm_fit)
## [1] -68.5335
## attr(,"df")
## [1] 4

To the extent that this log-likelihood is not the same as we obtained for our null model using fitmultiBM, this difference should asymptotically disappear as we increase levs for the discrete approximation. This can get a bit computationally heavy, but let’s just try levs=300 and see what happens.

fitmultiBM(grunter_tree,pc1,diet,
  model=model,parallel=TRUE,levs=300,
  null_model=TRUE)
## This is the design of your custom discrete-trait model:
##           carnivore herbivore omnivore
## carnivore         0         0        3
## herbivore         0         0        4
## omnivore          1         2        0
## Object of class "fitmultiBM" based on
##     a discretization with k = 300 levels.
## 
## Fitted multi-rate BM model parameters:
##  levels: [ carnivore, herbivore, omnivore ]
##   sigsq: [ 0.0436, 0.0436, 0.0436 ]
##      x0: -0.3647 
## 
## Estimated Q matrix:
##             carnivore    herbivore    omnivore
## carnivore -0.02006333  0.000000000  0.02006333
## herbivore  0.00000000 -0.067402967  0.06740297
## omnivore   0.10521778  0.008056971 -0.11327475
## 
## Log-likelihood: -68.3704 
## 
## R thinks it has found the ML solution.

Now they're even closer, and this trend will continue as we increase levs. As a valuable sidenote, even with levs=300 I left this all day (no exaggeration) to run!

Anyway, this is neat. I’ll leave it here for now, but will definitely be circling back to this method and function soon! Follow this space if you’re interested.