Today a phytools user & R-sig-phylo list participant asked:
“I am performing analyses using threshBayes
from the phytools package. I was wondering if someone has ever extracted the highest posterior density intervals (HPD) values using this function. How could I make this?”
As a quick reminder, the phytools function threshBayes
implements a Bayesian MCMC version of Felsenstein’s (2005; 2012) threshold model for estimating the evolutionary correlation between binary character (or between a discrete & binary trait) under a phylogenetic comparative interpretation of Wright’s threshold model from quantitative genetics.
This question is a bit of a softball, since we provided an exact workflow to do precisely this in Chapter 7 of our recent Princeton University Press book; however, not everyone has the book (hard to believe, I know), so I thought it was worth reiterating here!
To illustrate this workflow I’m going to use an excellent dataset of parity mode (i.e., oviparity vs. viviparity) & environmental temperature in South American liolaemid lizards from Esquerré et al. (2019).
We can start by loading phytools.
library(phytools)
packageVersion("phytools")
## [1] '1.9.25'
Now let’s add our data & tree as follows.
data("liolaemid.data")
head(liolaemid.data)
## parity_mode max_altitude temperature
## Ctenoblepharys_adspersa O 750 23.05
## Liolaemus_abaucan O 2600 20.20
## Liolaemus_albiceps V 4020 12.38
## Liolaemus_andinus V 4900 11.40
## Liolaemus_annectens V 4688 5.10
## Liolaemus_anomalus O 1400 23.78
data("liolaemid.tree")
liolaemid.tree
##
## Phylogenetic tree with 257 tips and 256 internal nodes.
##
## Tip labels:
## Liolaemus_abaucan, Liolaemus_koslowskyi, Liolaemus_albiceps, Liolaemus_irregularis, Liolaemus_ornatus, Liolaemus_calchaqui, ...
##
## Rooted; includes branch lengths.
plotTree(liolaemid.tree,ftype="i",fsize=0.4,type="arc",
lwd=1,arc_height=0.5)
There is one taxon in the dataset that’s missing from our tree.
chk<-geiger::name.check(liolaemid.tree,liolaemid.data)
chk
## $tree_not_data
## character(0)
##
## $data_not_tree
## [1] "Ctenoblepharys_adspersa"
liolaemid.data<-liolaemid.data[liolaemid.tree$tip.label,]
We’re already ready to run our MCMC!
mcmc_liol<-threshBayes(liolaemid.tree,liolaemid.data[,c(1,3)],
type=c("disc","cont"),ngen=500000,plot=FALSE,
control=list(print.interval=50000))
## Starting MCMC....
## generation: 50000; mean acceptance rate: 0.19
## generation: 100000; mean acceptance rate: 0.21
## generation: 150000; mean acceptance rate: 0.2
## generation: 200000; mean acceptance rate: 0.19
## generation: 250000; mean acceptance rate: 0.32
## generation: 300000; mean acceptance rate: 0.41
## generation: 350000; mean acceptance rate: 0.25
## generation: 400000; mean acceptance rate: 0.3
## generation: 450000; mean acceptance rate: 0.23
## generation: 500000; mean acceptance rate: 0.28
## Done MCMC.
Here we’ve run our MCMC for 500,000 generations. In practice, I’d recommend 1 million or more.
plot(mcmc_liol)
To be cautious, let’s pull out the just the latter half of this MCMC.
r.mcmc<-tail(mcmc_liol$par$r,0.5*nrow(mcmc_liol$par))
We’ll assign this vector the class attribute "mcmc"
and then is we use coda::HPDinterval
we should obtain a 95% high probability density interval on the correlation coefficient, our parameter of interest here.
class(r.mcmc)<-"mcmc"
hpd.liol_r<-coda::HPDinterval(r.mcmc)
hpd.liol_r
## lower upper
## var1 -0.7523365 -0.4540421
## attr(,"Probability")
## [1] 0.95002
Due to the ordination of our character 0 <-> V
, a negative evolutionary correlation implies that species with cooler environmental temperatures should be more likely to evolve towards viviparity (and the converse).
Lastly, let’s graph our posterior density & HPD interval for r.
plot(density(mcmc_liol,burnin=250000),bty="n",cex.lab=0.9,cex.axis=0.7)
h<-0-par()$usr[3]
lines(x=hpd.liol_r,y=rep(-h/2,2))
lines(x=rep(hpd.liol_r[1],2),y=c(-0.3,-0.7)*h)
lines(x=rep(hpd.liol_r[2],2),y=c(-0.3,-0.7)*h)
That pretty much comports with what we showed in our book!