*phytools* contains an implementation of the hidden-rates model of Beaulieu et al. (2013). (This method is also implemented in the excellent *corHMM* R package, but with some differences.)

Under the hidden-rates model, for any observed condition (e.g., 0/1) of our character, there can be two or more “hidden” states (e.g., 0 and 0* or 1 and 1*). In turn, the rate of evolution to other observed states (and hidden conditions) can depend on these hidden character values.

Some time ago I implemented the hidden-rates-model in the *phytools* function `fitHRM`

. This function is similar to *corHMM*, but with some differences. For instance, I implemented a version of the hidden rates model (which I dubbed the ‘umbral’ model) in which trait evolution *cannot* occur between hidden states of the character. (E.g., evolution might only proceed 0* ⇄ 0 ⇄ 1 ⇄ 1*.) The idea of this specific design was to capture a property of the threshold model (hence the name, which comes from the Spanish word for “threshold”) in which evolution sometimes takes the trait into conditions which can be hard to escape.

In addition, the *full* hidden-rates model of `phytools::fitHRM`

(for a given number of rate categories per state) also differs slightly from that implemented in *corHMM* in that `fitHRM`

treats 0* and 1* as special levels of character state 0 & 1 such that the transitions 0 → 0* and 1 → 1* need not occur at the same rate, even though both involve “turning on” character level *. *corHMM*, by contrast, treats both changes as equivalent, and thus treats the presence or absence of * as a separate (but hidden) trait. (This latter model can also be specified in `fitHRM`

by setting `corHMM_model=TRUE`

.)

In this light, I thought it could be of some interest to demonstrate possible workflows of `fitHRM`

in *phytools*. As a sidenote, I’ll add that in my limited experience *corHMM* is significantly *faster* in optimization than `fitHRM`

of *phytools* for the same model (although it does not support parallelization in Windows).

Previously, I’ve illustrated this method using a dataset of parity mode (viviparity vs. oviparity) in liolaemid reptiles from EsquerrÃ© et al. (2018), and I’ll re-use that example here. We can start by loading our tree and dataset as follows.

```
library(phytools)
liolaemid.tree<-read.nexus(
file="http://www.phytools.org/Rbook/7/Liolaemidae.MCC.nex")
liolaemid.tree
```

```
##
## Phylogenetic tree with 258 tips and 257 internal nodes.
##
## Tip labels:
## Ctenoblepharys_adspersa, Liolaemus_abaucan, Liolaemus_koslowskyi, Liolaemus_albiceps, Liolaemus_irregularis, Liolaemus_ornatus, ...
##
## Rooted; includes branch lengths.
```

```
liolaemid.data<-read.csv(
file="http://www.phytools.org/Rbook/7/Liolaemidae.data.csv",
row.names=1,stringsAsFactors=TRUE)
parity_mode<-setNames(liolaemid.data$parity_mode,
rownames(liolaemid.data))
levels(parity_mode)<-c("oviparous","viviparous")
head(parity_mode)
```

```
## Ctenoblepharys_adspersa Liolaemus_abaucan Liolaemus_albiceps Liolaemus_andinus
## oviparous oviparous viviparous viviparous
## Liolaemus_annectens Liolaemus_anomalus
## viviparous oviparous
## Levels: oviparous viviparous
```

Next, I’ll fit a total of four character evolution models.

First, I’ll fit a *standard* M*k* model with no hidden states. I’ll do it using `fitHRM`

– but I would’ve obtained the same result using `fitMk`

.

```
mk_parity<-fitHRM(liolaemid.tree,parity_mode,
ncat=1,model="ARD",niter=5,pi="fitzjohn")
```

```
##
## This is the design matrix of the fitted model.
## Does it make sense?
##
## oviparous viviparous
## oviparous 0 1
## viviparous 2 0
##
## log-likelihood from current iteration: -65.9463
## --- Best log-likelihood so far: -65.9463 ---
## log-likelihood from current iteration: -65.9463
## --- Best log-likelihood so far: -65.9463 ---
## log-likelihood from current iteration: -65.9463
## --- Best log-likelihood so far: -65.9463 ---
## log-likelihood from current iteration: -65.9463
## --- Best log-likelihood so far: -65.9463 ---
## log-likelihood from current iteration: -65.9463
## --- Best log-likelihood so far: -65.9463 ---
```

```
mk_parity
```

```
## Object of class "fitHRM".
##
## Observed states: [ oviparous, viviparous ]
## Number of rate categories per state: [ 1, 1 ]
##
## Fitted (or set) value of Q:
## oviparous viviparous
## oviparous -0.033819 0.033819
## viviparous 0.000000 0.000000
##
## Fitted (or set) value of pi:
## oviparous viviparous
## 1 0
## due to treating the root prior as (a) nuisance.
##
## Log-likelihood: -65.946269
##
## Optimization method used was "nlminb"
```

Next, I’ll proceed to fit my so-called “umbral” model. In theory, this model could have multiple, hidden states of each observed character conditions – although I’m not aware of research indicating whether or not we should expect these models to be identifiable. Perhaps more interestingly (for some cases) we can have a hidden condition of one character, but not the other. Just to take parity mode as an example, perhaps viviparity has two unoberved levels: viviparity with the capacity to revert to oviparity; and viviparity *without* this evolutionary lability – but oviparity (say) may have equal chances of evolving to viviparity in all oviparous lineages. If so, this would correspond to an “umbral” model but with a hidden state *only* for viviparity. For now, we’ll allow both conditions of our trait to have a hidden level by setting `ncat=2`

.

Fitting a hidden-rates model is a hard computational problem. If we have multiple computer cores to work with (and most everybody does these days), two options that we can try are `opt.method="optimParallel"`

, which uses the CRAN package *optimParallel* to parallelize optimization – *or* we can use `parallel=TRUE`

, which uses the package *foreach* to split our different iterations of optimization across cores. For our hidden-rates model, it doesn’t seem to make much difference in total computation time; however, as a general rule, `opt.method="optimParallel"`

should be preferred for higher dimensional optimizations – whereas `parallel=TRUE`

(and, thus, *foreach*) may work better for multiple iterations of a lower dimensional problem.

(Importantly, because of the way that R parallelizes – by creating multiple instances and then communicating over sockets – we *cannot* do both `parallel=TRUE`

and `opt.method="optimParallel"`

at once!)

```
umbral_parity<-fitHRM(liolaemid.tree,parity_mode,
ncat=2,model="ARD",niter=10,pi="fitzjohn",
umbral=TRUE,parallel=TRUE,ncores=10)
```

```
##
## This is the design matrix of the fitted model.
## Does it make sense?
##
## oviparous oviparous* viviparous viviparous*
## oviparous 0 1 2 0
## oviparous* 3 0 0 0
## viviparous 4 0 0 5
## viviparous* 0 0 6 0
##
## Opened cluster with 10 cores.
## Running optimization iterations in parallel.
## Please wait....
```

```
umbral_parity
```

```
## Object of class "fitHRM".
##
## Observed states: [ oviparous, viviparous ]
## Number of rate categories per state: [ 2, 2 ]
##
## Fitted (or set) value of Q:
## oviparous oviparous* viviparous viviparous*
## oviparous -1.971168 0.039879 1.931289 0.000000
## oviparous* 0.037896 -0.037896 0.000000 0.000000
## viviparous 2.697094 0.000000 -3.111548 0.414454
## viviparous* 0.000000 0.000000 0.000000 0.000000
##
## Fitted (or set) value of pi:
## oviparous oviparous* viviparous viviparous*
## 0.034061 0.939412 0.026527 0.000000
## due to treating the root prior as (a) nuisance.
##
## Log-likelihood: -60.692435
##
## Optimization method used was "nlminb"
```

(Here I chose `parallel=TRUE`

and made a cluster of 10 cores. We can create a cluster containing up to as many cores as we have in our machine – but it doesn’t really make sense here to use more cores than we have iterations.)

Now I’ll fit a different hidden-rates model but in which transitions are allowed to occur between hidden levels of our character. Here, I’ll structure my model as Beaulieu et al. do in the *corHMM* package and treat my * condition as if it’s a separate, unobserved character evolving on the tree – so a change (say) from 0* to 0 (loss of star) occurs with the same rate as a change from 1* to 1 (etc.).

To specify that I want `fitHRM`

to fit this particular version of the hidden-rates model, I set the optional argument `corHMM_model=TRUE`

. In this case I know (from experience!) that this model is a bit difficult for our function to optimize, so I’m going to set the argument `niter=20`

which doubles the number of optimization iterations that we have run compared to what we did for our prior models. (Since I’m increasing `niter`

, I may as well set `ncores`

to the number I have in my machine: 16.)

```
corhmm_parity<-fitHRM(liolaemid.tree,parity_mode,
ncat=2,model="ARD",niter=20,pi="fitzjohn",
corHMM_model=TRUE,parallel=TRUE,ncores=16)
```

```
##
## This is the design matrix of the fitted model.
## Does it make sense?
##
## oviparous oviparous* viviparous viviparous*
## oviparous 0 1 2 0
## oviparous* 3 0 0 4
## viviparous 5 0 0 1
## viviparous* 0 6 3 0
##
## Opened cluster with 16 cores.
## Running optimization iterations in parallel.
## Please wait....
```

```
corhmm_parity
```

```
## Object of class "fitHRM".
##
## Observed states: [ oviparous, viviparous ]
## Number of rate categories per state: [ 2, 2 ]
##
## Fitted (or set) value of Q:
## oviparous oviparous* viviparous viviparous*
## oviparous -0.254639 0.032053 0.222586 0.000000
## oviparous* 0.000000 -0.267649 0.000000 0.267649
## viviparous 4.495877 0.000000 -4.527930 0.032053
## viviparous* 0.000000 0.000000 0.000000 0.000000
##
## Fitted (or set) value of pi:
## oviparous oviparous* viviparous viviparous*
## 0.5 0.0 0.5 0.0
## due to treating the root prior as (a) nuisance.
##
## Log-likelihood: -61.123976
##
## Optimization method used was "optim"
```

Lastly, I’ll fit a “full” discrete character hidden-rates model with two rate categories for each observed state, as follows. (Note I changed the optimization method and increased `niter`

further still.)

```
hrm_parity<-fitHRM(liolaemid.tree,parity_mode,
ncat=2,model="ARD",niter=30,pi="fitzjohn",
opt.method="optimParallel")
```

```
##
## This is the design matrix of the fitted model.
## Does it make sense?
##
## oviparous oviparous* viviparous viviparous*
## oviparous 0 1 2 0
## oviparous* 3 0 0 4
## viviparous 5 0 0 6
## viviparous* 0 7 8 0
##
## log-likelihood from current iteration: -62.4482
## --- Best log-likelihood so far: -62.4482 ---
## log-likelihood from current iteration: -60.6925
## --- Best log-likelihood so far: -60.6925 ---
## log-likelihood from current iteration: -62.4481
## --- Best log-likelihood so far: -60.6925 ---
## log-likelihood from current iteration: -65.9468
## --- Best log-likelihood so far: -60.6925 ---
## log-likelihood from current iteration: -62.448
## --- Best log-likelihood so far: -60.6925 ---
## log-likelihood from current iteration: -63.361
## --- Best log-likelihood so far: -60.6925 ---
## log-likelihood from current iteration: -65.947
## --- Best log-likelihood so far: -60.6925 ---
## log-likelihood from current iteration: -63.9123
## --- Best log-likelihood so far: -60.6925 ---
## log-likelihood from current iteration: -62.448
## --- Best log-likelihood so far: -60.6925 ---
## log-likelihood from current iteration: -62.448
## --- Best log-likelihood so far: -60.6925 ---
## log-likelihood from current iteration: -62.448
## --- Best log-likelihood so far: -60.6925 ---
## log-likelihood from current iteration: -65.9463
## --- Best log-likelihood so far: -60.6925 ---
## log-likelihood from current iteration: -63.361
## --- Best log-likelihood so far: -60.6925 ---
## log-likelihood from current iteration: -65.9462
## --- Best log-likelihood so far: -60.6925 ---
## log-likelihood from current iteration: -63.9119
## --- Best log-likelihood so far: -60.6925 ---
## log-likelihood from current iteration: -65.7014
## --- Best log-likelihood so far: -60.6925 ---
## log-likelihood from current iteration: -62.448
## --- Best log-likelihood so far: -60.6925 ---
## log-likelihood from current iteration: -65.9468
## --- Best log-likelihood so far: -60.6925 ---
## log-likelihood from current iteration: -65.9471
## --- Best log-likelihood so far: -60.6925 ---
## log-likelihood from current iteration: -62.448
## --- Best log-likelihood so far: -60.6925 ---
## log-likelihood from current iteration: -65.9463
## --- Best log-likelihood so far: -60.6925 ---
## log-likelihood from current iteration: -64.0826
## --- Best log-likelihood so far: -60.6925 ---
## log-likelihood from current iteration: -62.448
## --- Best log-likelihood so far: -60.6925 ---
## log-likelihood from current iteration: -65.9469
## --- Best log-likelihood so far: -60.6925 ---
## log-likelihood from current iteration: -62.448
## --- Best log-likelihood so far: -60.6925 ---
## log-likelihood from current iteration: -65.9464
## --- Best log-likelihood so far: -60.6925 ---
## log-likelihood from current iteration: -65.9464
## --- Best log-likelihood so far: -60.6925 ---
## log-likelihood from current iteration: -65.7014
## --- Best log-likelihood so far: -60.6925 ---
## log-likelihood from current iteration: -60.7148
## --- Best log-likelihood so far: -60.6925 ---
## log-likelihood from current iteration: -63.912
## --- Best log-likelihood so far: -60.6925 ---
```

```
hrm_parity
```

```
## Object of class "fitHRM".
##
## Observed states: [ oviparous, viviparous ]
## Number of rate categories per state: [ 2, 2 ]
##
## Fitted (or set) value of Q:
## oviparous oviparous* viviparous viviparous*
## oviparous -1.971232 0.039862 1.931370 0.000000
## oviparous* 0.037895 -0.037895 0.000000 0.000000
## viviparous 2.697059 0.000000 -3.111516 0.414456
## viviparous* 0.000000 0.000000 0.000000 0.000000
##
## Fitted (or set) value of pi:
## oviparous oviparous* viviparous viviparous*
## 0.034037 0.939455 0.026509 0.000000
## due to treating the root prior as (a) nuisance.
##
## Log-likelihood: -60.692462
##
## Optimization method used was "optimParallel"
```

(Astute readers might notice that this model has almost exactly the *same* log-likelihood as our “umbral” model! Later it should become clear why that is.)

We can compare among each of these four models using an `anova`

generic method.

```
aov_parity<-anova(mk_parity,umbral_parity,
corhmm_parity,hrm_parity)
```

```
## log(L) d.f. AIC weight
## object -65.94627 2 135.8925 0.13786193
## umbral_parity -60.69243 6 133.3849 0.48303419
## corhmm_parity -61.12398 6 134.2480 0.31373410
## hrm_parity -60.69246 8 137.3849 0.06536978
```

This shows us that our best supported model is the umbral model, but with some weight of evidence falling on each of the other three.

```
par(mfrow=c(2,2))
plot(mk_parity,spacer=0.4,asp=0.7,offset=0.05,
signif=4)
mtext("a) simple Mk model", adj=0.1)
plot(umbral_parity,spacer=0.4,asp=0.7,offset=0.05,
signif=4)
mtext("b) umbral model", adj=0.1)
plot(corhmm_parity,spacer=0.4,asp=0.5,offset=0.05,
signif=4)
mtext("c) corHMM hidden-rates model", adj=0.1)
plot(hrm_parity,spacer=0.4,asp=0.5,offset=0.05,
signif=4)
mtext("d) full hidden-rates model",adj=0.1)
```

*Plotting* our four models helps reveal why the umbral model and the full hidden-rates model had the same likelihoods – since the MLEs for the transition rates between `viviparous*`

⇄ `oviparous*`

are both estimated to be zero, they *are* the same model! (Note that even if we swapped `viviparous`

for `viviparous*`

and `oviparous`

for `oviparous*`

, this would *also* be the same model. Get it?)

Lastly, we can compute our model-averaged ancestral states.

This is a bit treacherous for hidden-rate models, because (except for in the case of `umbral=TRUE`

) there’s no guarantee that, for example, `oviparous*`

is the “cool” condition of oviparity, and `oviparous`

(no *) the “warm” state. (The opposite could be true.)

From our prior figure, though, I can see that each * condition has a *lower* rate of transition to other states, so should rightly be considered homologous with the corresponding unobserved condition in the other models. Under other circumstances, it might be wiser to choose our best model or to compute ancestral states under multiple models and compare the results.

```
anc_parity<-ancr(aov_parity,tips=TRUE)
anc_parity
```

```
## Marginal ancestral state estimates:
## oviparous oviparous* viviparous viviparous*
## Ctenoblepharys_adspersa 0.486660 0.513340 0.000000 0.000000
## Liolaemus_abaucan 0.478123 0.521877 0.000000 0.000000
## Liolaemus_koslowskyi 0.478123 0.521877 0.000000 0.000000
## Liolaemus_albiceps 0.000000 0.000000 0.159029 0.840971
## Liolaemus_irregularis 0.000000 0.000000 0.159029 0.840971
## Liolaemus_ornatus 0.000000 0.000000 0.172144 0.827856
## ...
##
## Log-likelihood = -78.672204
```

Something that we *can* do without worry, however, is compute the ancestral node probabilities for each of the *observed* states of our trait.

This simply involves summing the model-averaged marginal scaled likelihoods across each of the unobserved levels for each observed trait condition. In this case we don’t need to know if the * conditions are hot or cold – it doesn’t matter.

To make this a bit easier, I’ll write a simple custom function for this called `hide_hidden`

that will take care exactly this calculation for us! (I’ll add this to *phytools* in some form later.)

```
hide_hidden<-function(object,...){
ss<-colnames(object$ace)
ss<-ss[-grep("*",ss,fixed=TRUE)]
anc<-matrix(0,nrow(object$ace),length(ss),
dimnames=list(rownames(object$ace),ss))
for(i in 1:length(ss)){
anc[,ss[i]]<-rowSums(object$ace[,grep(ss[i],
colnames(object$ace))])
}
anc
}
```

Let’s apply it!

```
hidden<-hide_hidden(anc_parity)
plotTree(liolaemid.tree,type="fan",ftype="off",lwd=1,
part=0.5)
par(fg="transparent")
nodelabels(pie=hidden[1:Nnode(liolaemid.tree)+
Ntip(liolaemid.tree),],piecol=c("#fffdd0","#DF536B"),
cex=0.3)
tiplabels(pie=hidden[liolaemid.tree$tip.label,],
piecol=c("#fffdd0","#DF536B"),
cex=0.2)
par(fg="black")
legend("topleft",levels(parity_mode),pch=16,
col=c("#fffdd0","#DF536B"),bty="n",cex=0.8,
pt.cex=1.5)
```

Cool.

As a brief addendum, here’s how you can fit the `corHMM_model=TRUE`

model of `fitHRM`

using the *corHMM* package itself (as one might naturally do). We can see that this should give the same result (barring any failure of optimization) as we obtained using `fitHRM`

for `corHMM_model=TRUE`

, above.

```
set.seed(99)
```

```
fit_corhmm<-corHMM::corHMM(liolaemid.tree,
data.frame(Genus_sp=names(parity_mode),
parity=as.numeric(parity_mode)),root.p="maddfitz",
rate.cat=2,model="ARD",
nstarts=10)
```

```
## State distribution in data:
## States: 1 2
## Counts: 79 179
## Beginning thorough optimization search -- performing 10 random restarts
## Finished. Inferring ancestral states using marginal reconstruction.
```

```
fit_corhmm
```

```
##
## Fit
## -lnL AIC AICc Rate.cat ntax
## -61.12348 134.247 134.5816 2 258
##
## Legend
## 1 2
## "1" "2"
##
## Rates
## (1,R1) (2,R1) (1,R2) (2,R2)
## (1,R1) NA 0.270159078 0.032045033 NA
## (2,R1) 5.462929563 NA NA 0.03204503
## (1,R2) 0.000000001 NA NA 0.26773349
## (2,R2) NA 0.000000001 0.000000001 NA
##
## Arrived at a reliable solution
```

Hi Liam! Nice post! Would it be possible to run the full hidden-rates model in corHMM with a customized rate.mat?

ReplyDeleteBest

Daniel

Quite possibly!

DeleteNice! Thanks!

Delete