A friend & colleague asked on Twitter about how stochastic character mapping using
phytools::make.simmap
scales with larger trees.
I've seen a few posts now discussing simmap on medium-large phylo's. Does it scale particularly bad (runtime-wise) with tree size?
— Roi Maor (@Roi_Maor) July 25, 2022
Nearly five years ago, I
posted
about parallelizing make.simmap
using the R package
snow. I thought it might be a good moment to revisit
that post.
For this example, I'll use the same phylogeny & dataset from Ramm et al. (2019) that I used on my blog of yesterday & that also features in an exercise of my upcoming book with Luke Harmon.
Let's load our packages.
library(phytools)
library(geiger)
library(snow)
In this example, I'm going to use the latest phytools version which contains one small fix
to allow an object of class "Qmatrix"
to be passed as a fixed value of Q to make.simmap
.
(Otherwise we need to pass a simple matrix, which can be obtained from an object of class
"Qmatrix"
using object<-unclass(object)
.)
packageVersion("phytools")
## [1] '1.1.3'
Next, we can read our tree and data from file. This time, let's analyze the trait "habitat"
instead of "tail.spines"
.
## now read data from file
lizard.tree<-read.nexus(file="http://www.phytools.org/Rbook/7/lizard_tree.nex")
lizard.data<-read.csv(file="http://www.phytools.org/Rbook/7/lizard_spines.csv",
row.names=1,stringsAsFactors=TRUE)
## run name.check and prune tree to match input data.
chk<-name.check(lizard.tree,lizard.data)
summary(chk)
## 3504 taxa are present in the tree but not the data:
## Ablepharus_chernovi,
## Ablepharus_kitaibelii,
## Ablepharus_pannonicus,
## Abronia_aurita,
## Abronia_campbelli,
## Abronia_chiszari,
## ....
##
## To see complete list of mis-matched taxa, print object.
lizard.tree<-drop.tip(lizard.tree,chk$tree_not_data)
## pull out the environmental trait "habitat"
habitat<-setNames(lizard.data$habitat,
rownames(lizard.data))
For fun, let's plot our data in a fan-style, just as we did yesterday.
plotTree(lizard.tree,type="fan",lwd=1,ftype="off")
obj<-get("last_plot.phylo",envir=.PlotPhyloEnv)
n<-Ntip(lizard.tree)
col.hab<-setNames(c("darkgreen","grey"),levels(habitat))
par(lend=3)
for(i in 1:Ntip(lizard.tree)){
cc<-if(obj$xx[i]>0) 5 else -5
th<-atan(obj$yy[i]/obj$xx[i])
segments(obj$xx[i],obj$yy[i],
obj$xx[i]+cc*cos(th),
obj$yy[i]+cc*sin(th),lwd=4,
col=col.hab[habitat[lizard.tree$tip.label[i]]])
}
legend("topleft",legend=levels(habitat),
pch=15,col=col.hab,
pt.cex=1.5,cex=0.8,bty="n")