Saturday, June 1, 2019

Animation of add.species.to.genus function

My package phytools has seen many & diverse changes since it was last updated on CRAN. I'm trying to get a new version up in advance of some workshops that I will be doing this month (at the Evolution meeting) and in July. My second phytools goal for the year is to submit a new manuscript describing the metaphoric 'version 2.0' of the package to reflect the numerous & varied changes to phytools since it's original publication in 2012.

As I work through some of the older package functions, I ran across bug report regarding the (mis)behavior of the function add.species.to.genus on large trees - in which (evidently) the function at some point, after many taxa have been added iteratively, spontaneously decides that the tree is no longer ultrametric and drops all edge lengths.

Without having a reproducible example to work from, this is hard to diagnose completely - but I strongly suspect it stems from using an empirical tree read from a file in which edge lengths are rounded (as they invariably must be). This can probably be resolved by using the phytools function force.ultrametric before running the function.

As far as I can tell when the tree is ultrametric to the numerical precision of R this is not a problem.

For fun, here is an example in which I will randomly add 100 taxa to a phylogeny of 10 species in 5 genera & animate it by coloring the edges of the tree by the genus that they belong to:

library(phytools)
tree
## 
## Phylogenetic tree with 10 tips and 9 internal nodes.
## 
## Tip labels:
##  Genus1_sp1, Genus1_sp2, Genus2_sp1, Genus2_sp2, Genus2_sp3, Genus3_sp1, ...
## 
## Rooted; includes branch lengths.
plotTree(tree,ftype="i")

plot of chunk unnamed-chunk-1

for(i in 1:length(new.sp)){
    tree<-add.species.to.genus(tree,new.sp[i],where="random")
    MRCA<-vector()
    tips<-lapply(lapply(genera,grep,tree$tip.label),function(x,y)
        y[x],y=tree$tip.label)
    MRCA<-sapply(tips,function(x,y){
        if(length(x)>1) findMRCA(y,x) 
        else getParent(y,which(y$tip.label==x))
        },y=tree)
    painted<-tree
    for(j in 1:length(MRCA)) painted<-paintSubTree(painted,MRCA[j],j+1)
    cols<-setNames(cols<-c("grey",brewer.pal(5,"Accent")),1:6)
    dev.hold()
    plotTree(tree,ftype="off",lwd=4)
    plot(painted,cols,ftype="off",add=TRUE)
    dev.flush()
}

(The .gif creation is actually integrated into this workflow using ImageMagick.)

Here is how the original tree & taxa to be added were simulated:

set.seed(1)
tree<-phytools:::lambdaTree(pbtree(n=10),0.3)
tree$tip.label[1:2]<-paste("Genus1_sp",1:2,sep="")
tree$tip.label[3:5]<-paste("Genus2_sp",1:3,sep="")
tree$tip.label[6:7]<-paste("Genus3_sp",1:2,sep="")
tree$tip.label[8]<-"Genus4_sp1"
tree$tip.label[9:10]<-paste("Genus5_sp",1:2,sep="")
genera<-paste("Genus",1:5,sep="")
sp<-sapply(genera,function(x,y) length(grep(x,y)),y=tree$tip.label)
new.sp<-vector()
for(i in 1:100){
    ii<-sample(1:5,1)
    sp[ii]<-sp[ii]+1
    new.sp[i]<-paste(genera[ii],"_sp",sp[ii],sep="")
}

That's all.

1 comment:

  1. Dear Liam,

    This blog is incredibly informative, thanks!

    I am very new to these phylogenetic analyses and have a doubt. I have a set of 100 trees stored in a .nex file, which opens into a multiphylo object. Such trees are based on a master phylogeny that did not include a species I sampled. I am trying to use the add.species.to.genus tool to include such species in all 100 trees but have not been able to do so. I have tried lapply, as well as for loops, but the tool insists that it can only work with objects of class phylo.

    Do you have any suggestions as how this could be solved?

    Thanks!

    ReplyDelete

Note: due to the very large amount of spam, all comments are now automatically submitted for moderation.