Phylogenetic Tools for Comparative Biology: Ancestral character estimation under the hidden-rates model using phytools

phytools contains an implementation of the hidden-rates model of Beaulieu et al. (2013). (This method is also implemented in the excellent corHMM R package, but with some differences.)

Under the hidden-rates model, for any observed condition (e.g., 0/1) of our character, there can be two or more “hidden” states (e.g., 0 and 0* or 1 and 1*). In turn, the rate of evolution to other observed states (and hidden conditions) can depend on these hidden character values.

Some time ago I implemented the hidden-rates-model in the phytools function fitHRM. This function is similar to corHMM, but with some differences. For instance, I implemented a version of the hidden rates model (which I dubbed the ‘umbral’ model) in which trait evolution cannot occur between hidden states of the character. (E.g., evolution might only proceed 0* ⇄ 0 ⇄ 1 ⇄ 1*.) The idea of this specific design was to capture a property of the threshold model (hence the name, which comes from the Spanish word for “threshold”) in which evolution sometimes takes the trait into conditions which can be hard to escape.

In addition, the full hidden-rates model of phytools::fitHRM (for a given number of rate categories per state) also differs slightly from that implemented in corHMM in that fitHRM treats 0* and 1* as special levels of character state 0 & 1 such that the transitions 0 → 0* and 1 → 1* need not occur at the same rate, even though both involve “turning on” character level *. corHMM, by contrast, treats both changes as equivalent, and thus treats the presence or absence of * as a separate (but hidden) trait. (This latter model can also be specified in fitHRM by setting corHMM_model=TRUE.)

In this light, I thought it could be of some interest to demonstrate possible workflows of fitHRM in phytools. As a sidenote, I’ll add that in my limited experience corHMM is significantly faster in optimization than fitHRM of phytools for the same model (although it does not support parallelization in Windows).

Previously, I’ve illustrated this method using a dataset of parity mode (viviparity vs. oviparity) in liolaemid reptiles from Esquerré et al. (2018), and I’ll re-use that example here. We can start by loading our tree and dataset as follows.

library(phytools)
liolaemid.tree<-read.nexus(
  file="http://www.phytools.org/Rbook/7/Liolaemidae.MCC.nex")
liolaemid.tree

## 
## Phylogenetic tree with 258 tips and 257 internal nodes.
## 
## Tip labels:
##   Ctenoblepharys_adspersa, Liolaemus_abaucan, Liolaemus_koslowskyi, Liolaemus_albiceps, Liolaemus_irregularis, Liolaemus_ornatus, ...
## 
## Rooted; includes branch lengths.

liolaemid.data<-read.csv(
  file="http://www.phytools.org/Rbook/7/Liolaemidae.data.csv",
  row.names=1,stringsAsFactors=TRUE)
parity_mode<-setNames(liolaemid.data$parity_mode,
  rownames(liolaemid.data))
levels(parity_mode)<-c("oviparous","viviparous")
head(parity_mode)

## Ctenoblepharys_adspersa       Liolaemus_abaucan      Liolaemus_albiceps       Liolaemus_andinus 
##               oviparous               oviparous              viviparous              viviparous 
##     Liolaemus_annectens      Liolaemus_anomalus 
##              viviparous               oviparous 
## Levels: oviparous viviparous

Next, I’ll fit a total of four character evolution models.

First, I’ll fit a standard Mk model with no hidden states. I’ll do it using fitHRM – but I would’ve obtained the same result using fitMk.

mk_parity<-fitHRM(liolaemid.tree,parity_mode,
  ncat=1,model="ARD",niter=5,pi="fitzjohn")

## 
## This is the design matrix of the fitted model.
## Does it make sense?
## 
##            oviparous viviparous
## oviparous          0          1
## viviparous         2          0
## 
## log-likelihood from current iteration: -65.9463 
##  --- Best log-likelihood so far: -65.9463 ---
## log-likelihood from current iteration: -65.9463 
##  --- Best log-likelihood so far: -65.9463 ---
## log-likelihood from current iteration: -65.9463 
##  --- Best log-likelihood so far: -65.9463 ---
## log-likelihood from current iteration: -65.9463 
##  --- Best log-likelihood so far: -65.9463 ---
## log-likelihood from current iteration: -65.9463 
##  --- Best log-likelihood so far: -65.9463 ---

mk_parity

## Object of class "fitHRM".
## 
## Observed states: [ oviparous, viviparous ]
## Number of rate categories per state: [ 1, 1 ]
## 
## Fitted (or set) value of Q:
##            oviparous viviparous
## oviparous  -0.033819   0.033819
## viviparous  0.000000   0.000000
## 
## Fitted (or set) value of pi:
##  oviparous viviparous 
##          1          0 
## due to treating the root prior as (a) nuisance.
## 
## Log-likelihood: -65.946269 
## 
## Optimization method used was "nlminb"

Next, I’ll proceed to fit my so-called “umbral” model. In theory, this model could have multiple, hidden states of each observed character conditions – although I’m not aware of research indicating whether or not we should expect these models to be identifiable. Perhaps more interestingly (for some cases) we can have a hidden condition of one character, but not the other. Just to take parity mode as an example, perhaps viviparity has two unoberved levels: viviparity with the capacity to revert to oviparity; and viviparity without this evolutionary lability – but oviparity (say) may have equal chances of evolving to viviparity in all oviparous lineages. If so, this would correspond to an “umbral” model but with a hidden state only for viviparity. For now, we’ll allow both conditions of our trait to have a hidden level by setting ncat=2.

Fitting a hidden-rates model is a hard computational problem. If we have multiple computer cores to work with (and most everybody does these days), two options that we can try are opt.method="optimParallel", which uses the CRAN package optimParallel to parallelize optimization – or we can use parallel=TRUE, which uses the package foreach to split our different iterations of optimization across cores. For our hidden-rates model, it doesn’t seem to make much difference in total computation time; however, as a general rule, opt.method="optimParallel" should be preferred for higher dimensional optimizations – whereas parallel=TRUE (and, thus, foreach) may work better for multiple iterations of a lower dimensional problem.

(Importantly, because of the way that R parallelizes – by creating multiple instances and then communicating over sockets – we cannot do both parallel=TRUE and opt.method="optimParallel" at once!)

umbral_parity<-fitHRM(liolaemid.tree,parity_mode,
  ncat=2,model="ARD",niter=10,pi="fitzjohn",
  umbral=TRUE,parallel=TRUE,ncores=10)

## 
## This is the design matrix of the fitted model.
## Does it make sense?
## 
##             oviparous oviparous* viviparous viviparous*
## oviparous           0          1          2           0
## oviparous*          3          0          0           0
## viviparous          4          0          0           5
## viviparous*         0          0          6           0
## 
## Opened cluster with 10 cores.
## Running optimization iterations in parallel.
## Please wait....

umbral_parity

## Object of class "fitHRM".
## 
## Observed states: [ oviparous, viviparous ]
## Number of rate categories per state: [ 2, 2 ]
## 
## Fitted (or set) value of Q:
##             oviparous oviparous* viviparous viviparous*
## oviparous   -1.971168   0.039879   1.931289    0.000000
## oviparous*   0.037896  -0.037896   0.000000    0.000000
## viviparous   2.697094   0.000000  -3.111548    0.414454
## viviparous*  0.000000   0.000000   0.000000    0.000000
## 
## Fitted (or set) value of pi:
##   oviparous  oviparous*  viviparous viviparous* 
##    0.034061    0.939412    0.026527    0.000000 
## due to treating the root prior as (a) nuisance.
## 
## Log-likelihood: -60.692435 
## 
## Optimization method used was "nlminb"

(Here I chose parallel=TRUE and made a cluster of 10 cores. We can create a cluster containing up to as many cores as we have in our machine – but it doesn’t really make sense here to use more cores than we have iterations.)

Now I’ll fit a different hidden-rates model but in which transitions are allowed to occur between hidden levels of our character. Here, I’ll structure my model as Beaulieu et al. do in the corHMM package and treat my * condition as if it’s a separate, unobserved character evolving on the tree – so a change (say) from 0* to 0 (loss of star) occurs with the same rate as a change from 1* to 1 (etc.).

To specify that I want fitHRM to fit this particular version of the hidden-rates model, I set the optional argument corHMM_model=TRUE. In this case I know (from experience!) that this model is a bit difficult for our function to optimize, so I’m going to set the argument niter=20 which doubles the number of optimization iterations that we have run compared to what we did for our prior models. (Since I’m increasing niter, I may as well set ncores to the number I have in my machine: 16.)

corhmm_parity<-fitHRM(liolaemid.tree,parity_mode,
  ncat=2,model="ARD",niter=20,pi="fitzjohn",
  corHMM_model=TRUE,parallel=TRUE,ncores=16)

## 
## This is the design matrix of the fitted model.
## Does it make sense?
## 
##             oviparous oviparous* viviparous viviparous*
## oviparous           0          1          2           0
## oviparous*          3          0          0           4
## viviparous          5          0          0           1
## viviparous*         0          6          3           0
## 
## Opened cluster with 16 cores.
## Running optimization iterations in parallel.
## Please wait....

corhmm_parity

## Object of class "fitHRM".
## 
## Observed states: [ oviparous, viviparous ]
## Number of rate categories per state: [ 2, 2 ]
## 
## Fitted (or set) value of Q:
##             oviparous oviparous* viviparous viviparous*
## oviparous   -0.254639   0.032053   0.222586    0.000000
## oviparous*   0.000000  -0.267649   0.000000    0.267649
## viviparous   4.495877   0.000000  -4.527930    0.032053
## viviparous*  0.000000   0.000000   0.000000    0.000000
## 
## Fitted (or set) value of pi:
##   oviparous  oviparous*  viviparous viviparous* 
##         0.5         0.0         0.5         0.0 
## due to treating the root prior as (a) nuisance.
## 
## Log-likelihood: -61.123976 
## 
## Optimization method used was "optim"

Lastly, I’ll fit a “full” discrete character hidden-rates model with two rate categories for each observed state, as follows. (Note I changed the optimization method and increased niter further still.)

hrm_parity<-fitHRM(liolaemid.tree,parity_mode,
  ncat=2,model="ARD",niter=30,pi="fitzjohn",
  opt.method="optimParallel")

## 
## This is the design matrix of the fitted model.
## Does it make sense?
## 
##             oviparous oviparous* viviparous viviparous*
## oviparous           0          1          2           0
## oviparous*          3          0          0           4
## viviparous          5          0          0           6
## viviparous*         0          7          8           0
## 
## log-likelihood from current iteration: -62.4482 
##  --- Best log-likelihood so far: -62.4482 ---
## log-likelihood from current iteration: -60.6925 
##  --- Best log-likelihood so far: -60.6925 ---
## log-likelihood from current iteration: -62.4481 
##  --- Best log-likelihood so far: -60.6925 ---
## log-likelihood from current iteration: -65.9468 
##  --- Best log-likelihood so far: -60.6925 ---
## log-likelihood from current iteration: -62.448 
##  --- Best log-likelihood so far: -60.6925 ---
## log-likelihood from current iteration: -63.361 
##  --- Best log-likelihood so far: -60.6925 ---
## log-likelihood from current iteration: -65.947 
##  --- Best log-likelihood so far: -60.6925 ---
## log-likelihood from current iteration: -63.9123 
##  --- Best log-likelihood so far: -60.6925 ---
## log-likelihood from current iteration: -62.448 
##  --- Best log-likelihood so far: -60.6925 ---
## log-likelihood from current iteration: -62.448 
##  --- Best log-likelihood so far: -60.6925 ---
## log-likelihood from current iteration: -62.448 
##  --- Best log-likelihood so far: -60.6925 ---
## log-likelihood from current iteration: -65.9463 
##  --- Best log-likelihood so far: -60.6925 ---
## log-likelihood from current iteration: -63.361 
##  --- Best log-likelihood so far: -60.6925 ---
## log-likelihood from current iteration: -65.9462 
##  --- Best log-likelihood so far: -60.6925 ---
## log-likelihood from current iteration: -63.9119 
##  --- Best log-likelihood so far: -60.6925 ---
## log-likelihood from current iteration: -65.7014 
##  --- Best log-likelihood so far: -60.6925 ---
## log-likelihood from current iteration: -62.448 
##  --- Best log-likelihood so far: -60.6925 ---
## log-likelihood from current iteration: -65.9468 
##  --- Best log-likelihood so far: -60.6925 ---
## log-likelihood from current iteration: -65.9471 
##  --- Best log-likelihood so far: -60.6925 ---
## log-likelihood from current iteration: -62.448 
##  --- Best log-likelihood so far: -60.6925 ---
## log-likelihood from current iteration: -65.9463 
##  --- Best log-likelihood so far: -60.6925 ---
## log-likelihood from current iteration: -64.0826 
##  --- Best log-likelihood so far: -60.6925 ---
## log-likelihood from current iteration: -62.448 
##  --- Best log-likelihood so far: -60.6925 ---
## log-likelihood from current iteration: -65.9469 
##  --- Best log-likelihood so far: -60.6925 ---
## log-likelihood from current iteration: -62.448 
##  --- Best log-likelihood so far: -60.6925 ---
## log-likelihood from current iteration: -65.9464 
##  --- Best log-likelihood so far: -60.6925 ---
## log-likelihood from current iteration: -65.9464 
##  --- Best log-likelihood so far: -60.6925 ---
## log-likelihood from current iteration: -65.7014 
##  --- Best log-likelihood so far: -60.6925 ---
## log-likelihood from current iteration: -60.7148 
##  --- Best log-likelihood so far: -60.6925 ---
## log-likelihood from current iteration: -63.912 
##  --- Best log-likelihood so far: -60.6925 ---

hrm_parity

## Object of class "fitHRM".
## 
## Observed states: [ oviparous, viviparous ]
## Number of rate categories per state: [ 2, 2 ]
## 
## Fitted (or set) value of Q:
##             oviparous oviparous* viviparous viviparous*
## oviparous   -1.971232   0.039862   1.931370    0.000000
## oviparous*   0.037895  -0.037895   0.000000    0.000000
## viviparous   2.697059   0.000000  -3.111516    0.414456
## viviparous*  0.000000   0.000000   0.000000    0.000000
## 
## Fitted (or set) value of pi:
##   oviparous  oviparous*  viviparous viviparous* 
##    0.034037    0.939455    0.026509    0.000000 
## due to treating the root prior as (a) nuisance.
## 
## Log-likelihood: -60.692462 
## 
## Optimization method used was "optimParallel"

(Astute readers might notice that this model has almost exactly the same log-likelihood as our “umbral” model! Later it should become clear why that is.)

We can compare among each of these four models using an anova generic method.

aov_parity<-anova(mk_parity,umbral_parity,
  corhmm_parity,hrm_parity)

##                  log(L) d.f.      AIC     weight
## object        -65.94627    2 135.8925 0.13786193
## umbral_parity -60.69243    6 133.3849 0.48303419
## corhmm_parity -61.12398    6 134.2480 0.31373410
## hrm_parity    -60.69246    8 137.3849 0.06536978

This shows us that our best supported model is the umbral model, but with some weight of evidence falling on each of the other three.

par(mfrow=c(2,2))
plot(mk_parity,spacer=0.4,asp=0.7,offset=0.05,
  signif=4)
mtext("a) simple Mk model", adj=0.1)
plot(umbral_parity,spacer=0.4,asp=0.7,offset=0.05,
  signif=4)
mtext("b) umbral model", adj=0.1)
plot(corhmm_parity,spacer=0.4,asp=0.5,offset=0.05,
  signif=4)
mtext("c) corHMM hidden-rates model", adj=0.1)
plot(hrm_parity,spacer=0.4,asp=0.5,offset=0.05,
  signif=4)
mtext("d) full hidden-rates model",adj=0.1)

plot of chunk unnamed-chunk-11

Plotting our four models helps reveal why the umbral model and the full hidden-rates model had the same likelihoods – since the MLEs for the transition rates between viviparous* ⇄ oviparous* are both estimated to be zero, they are the same model! (Note that even if we swapped viviparous for viviparous* and oviparous for oviparous*, this would also be the same model. Get it?)

Lastly, we can compute our model-averaged ancestral states.

This is a bit treacherous for hidden-rate models, because (except for in the case of umbral=TRUE) there’s no guarantee that, for example, oviparous* is the “cool” condition of oviparity, and oviparous (no *) the “warm” state. (The opposite could be true.)

From our prior figure, though, I can see that each * condition has a lower rate of transition to other states, so should rightly be considered homologous with the corresponding unobserved condition in the other models. Under other circumstances, it might be wiser to choose our best model or to compute ancestral states under multiple models and compare the results.

anc_parity<-ancr(aov_parity,tips=TRUE)
anc_parity

## Marginal ancestral state estimates:
##                         oviparous oviparous* viviparous viviparous*
## Ctenoblepharys_adspersa  0.486660   0.513340   0.000000    0.000000
## Liolaemus_abaucan        0.478123   0.521877   0.000000    0.000000
## Liolaemus_koslowskyi     0.478123   0.521877   0.000000    0.000000
## Liolaemus_albiceps       0.000000   0.000000   0.159029    0.840971
## Liolaemus_irregularis    0.000000   0.000000   0.159029    0.840971
## Liolaemus_ornatus        0.000000   0.000000   0.172144    0.827856
## ...
## 
## Log-likelihood = -78.672204

Something that we can do without worry, however, is compute the ancestral node probabilities for each of the observed states of our trait.

This simply involves summing the model-averaged marginal scaled likelihoods across each of the unobserved levels for each observed trait condition. In this case we don’t need to know if the * conditions are hot or cold – it doesn’t matter.

To make this a bit easier, I’ll write a simple custom function for this called hide_hidden that will take care exactly this calculation for us! (I’ll add this to phytools in some form later.)

hide_hidden<-function(object,...){
  ss<-colnames(object$ace)
  ss<-ss[-grep("*",ss,fixed=TRUE)]
  anc<-matrix(0,nrow(object$ace),length(ss),
    dimnames=list(rownames(object$ace),ss))
  for(i in 1:length(ss)){
    anc[,ss[i]]<-rowSums(object$ace[,grep(ss[i],
      colnames(object$ace))])
  }
  anc
}

Let’s apply it!

hidden<-hide_hidden(anc_parity)
plotTree(liolaemid.tree,type="fan",ftype="off",lwd=1,
  part=0.5)
par(fg="transparent")
nodelabels(pie=hidden[1:Nnode(liolaemid.tree)+
  Ntip(liolaemid.tree),],piecol=c("#fffdd0","#DF536B"),
  cex=0.3)
tiplabels(pie=hidden[liolaemid.tree$tip.label,],
  piecol=c("#fffdd0","#DF536B"),
  cex=0.2)
par(fg="black")
legend("topleft",levels(parity_mode),pch=16,
  col=c("#fffdd0","#DF536B"),bty="n",cex=0.8,
  pt.cex=1.5)

plot of chunk unnamed-chunk-14

Cool.

As a brief addendum, here’s how you can fit the corHMM_model=TRUE model of fitHRM using the corHMM package itself (as one might naturally do). We can see that this should give the same result (barring any failure of optimization) as we obtained using fitHRM for corHMM_model=TRUE, above.

set.seed(99)

fit_corhmm<-corHMM::corHMM(liolaemid.tree,
  data.frame(Genus_sp=names(parity_mode),
  parity=as.numeric(parity_mode)),root.p="maddfitz",
  rate.cat=2,model="ARD",
  nstarts=10)

## State distribution in data:
## States:	1	2	
## Counts:	79	179	
## Beginning thorough optimization search -- performing 10 random restarts 
## Finished. Inferring ancestral states using marginal reconstruction.

fit_corhmm

## 
## Fit
##       -lnL     AIC     AICc Rate.cat ntax
##  -61.12348 134.247 134.5816        2  258
## 
## Legend
##   1   2 
## "1" "2" 
## 
## Rates
##             (1,R1)      (2,R1)      (1,R2)     (2,R2)
## (1,R1)          NA 0.270159078 0.032045033         NA
## (2,R1) 5.462929563          NA          NA 0.03204503
## (1,R2) 0.000000001          NA          NA 0.26773349
## (2,R2)          NA 0.000000001 0.000000001         NA
## 
## Arrived at a reliable solution

Phylogenetic Tools for Comparative Biology

Pages

Tuesday, March 28, 2023

Ancestral character estimation under the hidden-rates model using phytools

3 comments: