Phylogenetic Tools for Comparative Biology: February 2024

The other week I posted about fitting an ordered Mk model in which the rate of change between adjacent character levels in the model varied as a n-degree polynomial functional form.

Still working out some bugs, but now in #Rstats #phytools -- fitting an ordered Mk model in which the rates of change between adjacent states vary according to an n-degree polynomial functional form: https://t.co/DcS0GFtALg. pic.twitter.com/FhlTQsASCR
— Liam Revell (@phytools_liam) February 21, 2024

This model is very cool (IMO); however, in practice it can be quite difficult to fit to data. I’m not sure why this is, but I suspect it has to do with the fact that the functional form can easily get the Q matrix into parts of parameter space where the likelihood is not defined.

I was searching around for a robust optimization routine in R that might work for “holey” likelihood surfaces that were not numerically differentiable & I stumbled onto DEoptim. DEoptim uses the differential evolution algorithm described in Storn & Price (1997; also see Mullen et al. 2011).

Differential evolution is a type of evolutionary algorithm in which the principles of natural selection are used to try to solve a complex problem – in this case, optimization of a real-valued function. The algorithm first creates a population of “individuals” consisting of randomly chosen values of the parameters to be optimized. Each generation, this population is subject to selection (based on the objective function: in our case, the likelihood function), mutation, and crossing-over. Differential evolution is robust, but inefficient for smooth objective functions. So far, I’ve found that it seems quite good at converging to the same or a better solution that I’m able to find using nlminb or optim.

To illustrate it’s use, I’m going to use some real empirical data for a (potentially) ordered character trait: caudal vertebrae number in Anolis. These data are from an article I have in review that was co-authored by Klaus Schliep, Travis Ingram, & Luke Mahler. (These data are available from a project GitHub repository here.)

To start with, I’ll load phytools – which should be updated from GitHub.

library(phytools)
packageVersion("phytools")

## [1] '2.2.0'

Next, we can read in our dataset and tree.

anolis_dat<-read.csv(file="anolis_caudal_vertebrae_data.csv",
  row.names=1)
head(anolis_dat)

anolis_tre<-read.tree(file="mcc.tre")
anolis_tre

## 
## Phylogenetic tree with 187 tips and 186 internal nodes.
## 
## Tip labels:
##   punctatus, transversalis, agassizi, casildae, microtus, aqbl, ...
## 
## Rooted; includes branch lengths.

There are some discrepancies between the data and tree, so I’ll use geiger::name.check to identify reconcile them.

chk<-geiger::name.check(anolis_tre,anolis_dat,data.names=anolis_dat$Species)
summary(chk)

## 76 taxa are present in the tree but not the data:
##     acutus,
##     aeneus,
##     alayoni,
##     alfaroi,
##     altae,
##     altitudinalis,
##     ....
## 
## To see complete list of mis-matched taxa, print object.

anolis_tre.pruned<-drop.tip(anolis_tre,chk$tree_not_data)
vert_number<-setNames(anolis_dat$Vert,anolis_dat$Species)

Finally, we’re ready to fit our model.

Here, I’m going to use fitfnMk with opt.method="DEoptim" and degree=0 to indicate that I want to fit a model with constant (i.e., degree 0 polynomial) backward & forward transition rates between adjacent levels. I could’ve also fit this model easily using phytools::fitMk or geiger::fitDiscrete. I’m using a modest number of iterations of maxit=100, simply because I expect thsi model will be straightforward to optimize.

anolis_fit0<-fitfnMk(anolis_tre.pruned,vert_number,
  degree=0,niter=1,opt.method="DEoptim",parallel=TRUE,
  maxit=100,trace=10,prompt=FALSE)

## 
## Iteration: 10 bestvalit: 303.445488 bestmemit:    0.129787    0.215553
## Iteration: 20 bestvalit: 303.441723 bestmemit:    0.130566    0.217604
## Iteration: 30 bestvalit: 303.441716 bestmemit:    0.130487    0.217581
## Iteration: 40 bestvalit: 303.441716 bestmemit:    0.130491    0.217592
## Iteration: 50 bestvalit: 303.441716 bestmemit:    0.130492    0.217592
## Iteration: 60 bestvalit: 303.441716 bestmemit:    0.130491    0.217592
## Iteration: 70 bestvalit: 303.441716 bestmemit:    0.130491    0.217592
## Iteration: 80 bestvalit: 303.441716 bestmemit:    0.130491    0.217592
## Iteration: 90 bestvalit: 303.441716 bestmemit:    0.130491    0.217592
## Iteration: 100 bestvalit: 303.441716 bestmemit:    0.130491    0.217592

Let’s graph our fitted model and the log-likelihood.

plot(as.Qmatrix(anolis_fit0),xlim=c(-1.6,1),ylim=c(-0.9,0.9),
  show.zeros=FALSE,width=TRUE,color=TRUE,text=FALSE,max.lwd=4)
mtext(paste("log(L) =",round(logLik(anolis_fit0),2)),
  line=2,adj=0.1,cex=1.5)

plot of chunk unnamed-chunk-49

Terrific. Now, let’s do the same for a slightly more interesting model – this one a first degree polynomial, i.e., “straight line.”

anolis_fit1<-fitfnMk(anolis_tre.pruned,vert_number,
  degree=1,niter=1,opt.method="DEoptim",parallel=TRUE,
  maxit=200,trace=20,prompt=FALSE)

## 
## Iteration: 20 bestvalit: 303.594425 bestmemit:   -0.001227    0.167693    0.000076    0.236956
## Iteration: 40 bestvalit: 303.181487 bestmemit:   -0.006270    0.207564   -0.006498    0.294802
## Iteration: 60 bestvalit: 303.176395 bestmemit:   -0.006165    0.202675   -0.006765    0.295882
## Iteration: 80 bestvalit: 303.176297 bestmemit:   -0.006164    0.202479   -0.006795    0.295932
## Iteration: 100 bestvalit: 303.176295 bestmemit:   -0.006169    0.202606   -0.006796    0.296011
## Iteration: 120 bestvalit: 303.176295 bestmemit:   -0.006169    0.202600   -0.006795    0.296001
## Iteration: 140 bestvalit: 303.176295 bestmemit:   -0.006169    0.202599   -0.006795    0.296001
## Iteration: 160 bestvalit: 303.176295 bestmemit:   -0.006169    0.202599   -0.006795    0.296001
## Iteration: 180 bestvalit: 303.176295 bestmemit:   -0.006169    0.202599   -0.006795    0.296001
## Iteration: 200 bestvalit: 303.176295 bestmemit:   -0.006169    0.202599   -0.006795    0.296001

This time, instead of just plotting the fitted Q matrix, let’s plot both the Q matrix and the function form. This will look as follows.

layout(matrix(c(1,2),1,2),widths=c(0.45,0.55))
plot(as.Qmatrix(anolis_fit1),width=TRUE,color=TRUE,
  show.zeros=FALSE,xlim=c(-1.6,1),ylim=c(-0.9,0.9),
  text=FALSE,max.lwd=4)
mtext(paste("log(L) =",round(logLik(anolis_fit1),2)),
  line=2,adj=0.1,cex=1.5)
par(mar=c(3.1,4.1,1.1,1.1))
plot(anolis_fit1)

plot of chunk unnamed-chunk-51

anolis_fit2<-fitfnMk(anolis_tre.pruned,vert_number,
  degree=2,niter=1,opt.method="DEoptim",parallel=TRUE,
  maxit=400,trace=40,prompt=FALSE)

## 
## Iteration: 40 bestvalit: 313.032922 bestmemit:    0.007247   -0.049616    0.154917    0.014953   -0.157637    0.311038
## Iteration: 80 bestvalit: 302.032363 bestmemit:    0.002105   -0.029666    0.310493    0.004510   -0.069509    0.462582
## Iteration: 120 bestvalit: 298.087849 bestmemit:   -0.003116    0.101355   -0.598268    0.001908   -0.022558    0.194169
## Iteration: 160 bestvalit: 297.922071 bestmemit:   -0.002958    0.095977   -0.538803    0.002085   -0.023139    0.184739
## Iteration: 200 bestvalit: 297.920032 bestmemit:   -0.002985    0.097075   -0.547527    0.002109   -0.023431    0.185733
## Iteration: 240 bestvalit: 297.920013 bestmemit:   -0.002987    0.097064   -0.547306    0.002102   -0.023354    0.185633
## Iteration: 280 bestvalit: 297.920013 bestmemit:   -0.002987    0.097054   -0.547246    0.002101   -0.023351    0.185630
## Iteration: 320 bestvalit: 297.920013 bestmemit:   -0.002987    0.097053   -0.547236    0.002101   -0.023351    0.185630
## Iteration: 360 bestvalit: 297.920013 bestmemit:   -0.002987    0.097053   -0.547237    0.002101   -0.023351    0.185630
## Iteration: 400 bestvalit: 297.920013 bestmemit:   -0.002987    0.097053   -0.547237    0.002101   -0.023351    0.185630

layout(matrix(c(1,2),1,2),widths=c(0.45,0.55))
plot(as.Qmatrix(anolis_fit2),width=TRUE,color=TRUE,
  show.zeros=FALSE,xlim=c(-1.6,1),ylim=c(-0.9,0.9),
  text=FALSE,max.lwd=4)
mtext(paste("log(L) =",round(logLik(anolis_fit2),2)),
  line=2,adj=0.1,cex=1.5)
par(mar=c(3.1,4.1,1.1,1.1))
plot(anolis_fit2)

plot of chunk unnamed-chunk-53

anolis_fit3<-fitfnMk(anolis_tre.pruned,vert_number,
  degree=3,niter=1,opt.method="DEoptim",parallel=TRUE,
  maxit=800,trace=80,prompt=FALSE)

## 
## Iteration: 80 bestvalit: 323.114084 bestmemit:   -0.110058    2.594987    0.095398   -0.658322   -0.023410    1.320606    2.973144   -1.786844
## Iteration: 160 bestvalit: 322.561268 bestmemit:   -0.046732    1.078965   -0.505863    0.012502   -0.018492    0.666359    0.283562   -0.283691
## Iteration: 240 bestvalit: 316.330584 bestmemit:   -0.007572    0.226626   -0.990072    1.418284   -0.004184    0.190996   -1.052335    1.798670
## Iteration: 320 bestvalit: 313.424801 bestmemit:   -0.003558    0.092464   -0.335300    0.466350   -0.000780    0.042978   -0.138476    0.294190
## Iteration: 400 bestvalit: 306.819655 bestmemit:   -0.001681    0.053889   -0.410016    1.077750   -0.000730    0.031338   -0.249775    0.728293
## Iteration: 480 bestvalit: 303.042333 bestmemit:    0.000301    0.002289   -0.115700    0.694865    0.001343   -0.026053    0.108531    0.314291
## Iteration: 560 bestvalit: 297.490606 bestmemit:   -0.000109    0.005058   -0.043357    0.146974    0.000344   -0.005590    0.017234    0.189912
## Iteration: 640 bestvalit: 297.296921 bestmemit:   -0.000019    0.002161   -0.022959    0.112327    0.000420   -0.008811    0.045689    0.140619
## Iteration: 720 bestvalit: 297.287727 bestmemit:   -0.000036    0.002814   -0.030054    0.134835    0.000411   -0.008379    0.040328    0.158209
## Iteration: 800 bestvalit: 297.287689 bestmemit:   -0.000036    0.002815   -0.030075    0.134978    0.000412   -0.008407    0.040547    0.157920

layout(matrix(c(1,2),1,2),widths=c(0.45,0.55))
plot(as.Qmatrix(anolis_fit3),width=TRUE,color=TRUE,
  show.zeros=FALSE,xlim=c(-1.6,1),ylim=c(-0.9,0.9),
  text=FALSE,max.lwd=4)
mtext(paste("log(L) =",round(logLik(anolis_fit3),2)),
  line=2,adj=0.1,cex=1.5)
par(mar=c(3.1,4.1,1.1,1.1))
plot(anolis_fit3)

Readers might have noted that I increased maxit (and trace – but that is just my print interval & will not affect optimization) for each increasingly complex model. There’s no reason to have done it in such a systematic manner, but it does take longer to converge (on average) for more parameter rich models of this type.

Finally, let’s compare our set of models as follows.

anova(anolis_fit0,
  anolis_fit1,
  anolis_fit2,
  anolis_fit3)

##                log(L) d.f.      AIC    weight
## anolis_fit0 -303.4417    2 610.8834 0.1444457
## anolis_fit1 -303.1763    4 614.3526 0.0254910
## anolis_fit2 -297.9200    6 607.8400 0.6615641
## anolis_fit3 -297.2877    8 610.5754 0.1684992

This tells us that the best supported model (accounting for parameter complexity) is actually our second-degree (i.e., quadratic) polynomial model. Cool.

We, of course, can consider higher order polynomial models – but these take even longer to optimize, so I may reserve that for a future post!

That’s it.

Phylogenetic Tools for Comparative Biology

Pages

Wednesday, February 28, 2024

Using differential evolution algorithms to optimize an ordered Mk model where the rate of change between states varies as an n-degree polynomial functional form