My package phytools has seen many & diverse changes since it was last updated on CRAN. I'm trying to get a new version up in advance of some workshops that I will be doing this month (at the Evolution meeting) and in July. My second phytools goal for the year is to submit a new manuscript describing the metaphoric 'version 2.0' of the package to reflect the numerous & varied changes to phytools since it's original publication in 2012.
As I work through some of the older package functions, I ran across
bug
report regarding the (mis)behavior of the function add.species.to.genus
on large trees - in which
(evidently) the function at some point, after many taxa have been added iteratively,
spontaneously decides that the tree is no longer ultrametric and drops all edge lengths.
Without having a reproducible example to work from, this is hard to diagnose completely -
but I strongly suspect it stems from using an empirical tree read from a file in which
edge lengths are rounded (as they invariably must be). This can probably be resolved
by using the phytools function force.ultrametric
before running
the function.
As far as I can tell when the tree is ultrametric to the numerical precision of R this is not a problem.
For fun, here is an example in which I will randomly add 100 taxa to a phylogeny of 10 species in 5 genera & animate it by coloring the edges of the tree by the genus that they belong to:
library(phytools)
tree
##
## Phylogenetic tree with 10 tips and 9 internal nodes.
##
## Tip labels:
## Genus1_sp1, Genus1_sp2, Genus2_sp1, Genus2_sp2, Genus2_sp3, Genus3_sp1, ...
##
## Rooted; includes branch lengths.
plotTree(tree,ftype="i")
for(i in 1:length(new.sp)){
tree<-add.species.to.genus(tree,new.sp[i],where="random")
MRCA<-vector()
tips<-lapply(lapply(genera,grep,tree$tip.label),function(x,y)
y[x],y=tree$tip.label)
MRCA<-sapply(tips,function(x,y){
if(length(x)>1) findMRCA(y,x)
else getParent(y,which(y$tip.label==x))
},y=tree)
painted<-tree
for(j in 1:length(MRCA)) painted<-paintSubTree(painted,MRCA[j],j+1)
cols<-setNames(cols<-c("grey",brewer.pal(5,"Accent")),1:6)
dev.hold()
plotTree(tree,ftype="off",lwd=4)
plot(painted,cols,ftype="off",add=TRUE)
dev.flush()
}
(The .gif creation is actually integrated into this workflow using ImageMagick.)
Here is how the original tree & taxa to be added were simulated:
set.seed(1)
tree<-phytools:::lambdaTree(pbtree(n=10),0.3)
tree$tip.label[1:2]<-paste("Genus1_sp",1:2,sep="")
tree$tip.label[3:5]<-paste("Genus2_sp",1:3,sep="")
tree$tip.label[6:7]<-paste("Genus3_sp",1:2,sep="")
tree$tip.label[8]<-"Genus4_sp1"
tree$tip.label[9:10]<-paste("Genus5_sp",1:2,sep="")
genera<-paste("Genus",1:5,sep="")
sp<-sapply(genera,function(x,y) length(grep(x,y)),y=tree$tip.label)
new.sp<-vector()
for(i in 1:100){
ii<-sample(1:5,1)
sp[ii]<-sp[ii]+1
new.sp[i]<-paste(genera[ii],"_sp",sp[ii],sep="")
}
That's all.
Dear Liam,
ReplyDeleteThis blog is incredibly informative, thanks!
I am very new to these phylogenetic analyses and have a doubt. I have a set of 100 trees stored in a .nex file, which opens into a multiphylo object. Such trees are based on a master phylogeny that did not include a species I sampled. I am trying to use the add.species.to.genus tool to include such species in all 100 trees but have not been able to do so. I have tried lapply, as well as for loops, but the tool insists that it can only work with objects of class phylo.
Do you have any suggestions as how this could be solved?
Thanks!