An R-sig-phylo subscriber recently asked if it was straightforward to add colored points to a plotted tree based on Bayesian posterior probabilities or bootstrap values.
The answer, of course, is yes. One can do this using the ape function nodelabels
; however, as Luke Harmon & I show in Chapter 13 of our recently published book, it’s pretty straightforward to script it for ourselves.
Let’s try.
For this example, I’m going to begin by inferring a tree using a dataset ("Laurasiatherian"
), that comes packaged with the phangorn package.
library(phytools)
library(phangorn)
data("Laurasiatherian")
Laurasiatherian
## 47 sequences with 3179 character and 1605 different site patterns.
## The states are a c g t
Now as I do this using phangorn::pml_bb
, approximate bootstrap values are stored as node labels in my object. (The argument, control=pml.control(trace=0)
just turns off the print-out, which can be quite extensive otherwise! In your interactive R session I recommend leaving that off so that you can see the progress of your optimization in real time.)
mammal.pml<-pml_bb(Laurasiatherian,model="GTR+G(4)",
control=pml.control(trace=0))
mammal.pml
## model: GTR+G(4)
## loglikelihood: -44699.66
## unconstrained loglikelihood: -17300.92
## Model of rate heterogeneity: Discrete gamma model
## Number of rate categories: 4
## Shape parameter: 0.3515
## Rate Proportion
## 1 0.01038367 0.25
## 2 0.14472115 0.25
## 3 0.67297557 0.25
## 4 3.17191961 0.25
##
## Rate matrix:
## a c g t
## a 0.000000 3.5989724 13.7579458 3.801364
## c 3.598972 0.0000000 0.4701416 25.006655
## g 13.757946 0.4701416 0.0000000 1.000000
## t 3.801364 25.0066550 1.0000000 0.000000
##
## Base frequencies:
## a c g t
## 0.3321866 0.1990791 0.2040652 0.2646691
Next, I’m going to pull the phylogeny (alone) off my optimized "pml"
object and re-root it, using "Platypus"
as my outgroup as follows. Since my node labels contain bootstrap support values, it’s very important that I indicate edgelabel=TRUE
in my re-rooting call.
mammal_tree<-root(mammal.pml$tree,"Platypus",resolve.root=TRUE,
edgelabel=TRUE)
mammal_tree
##
## Phylogenetic tree with 47 tips and 46 internal nodes.
##
## Tip labels:
## Platypus, Wallaroo, Possum, Bandicoot, Opposum, Armadillo, ...
## Node labels:
## Root, 0.901, 1, 1, 0.562, 1, ...
##
## Rooted; includes branch lengths.
Now for our support node value custom function. Here, the key piece of code will be get("last_plot.phylo",envir=.PlotPhyloEnv)
which gives us an object containing all the node coordinates of our most recent plotted tree. With this in hand it’s a very straightforward thing to add points of different colors to the nodes – depending on the bootstrap support of each clade in the reconstructed phylogeny.
add_support_labels<-function(node=NULL,support,
cols=c("white","black"),cex=1.5){
scale<-c(min(support,na.rm=TRUE),1)
support<-(support-scale[1])/diff(scale)
pp<-get("last_plot.phylo",envir=.PlotPhyloEnv)
if(is.null(node)) node<-1:pp$Nnode+pp$Ntip
colfunc<-colorRamp(cols)
node_cols<-colfunc(support)/255
x<-pp$xx[node]
y<-pp$yy[node]
for(i in 1:nrow(node_cols)){
if(!is.na(support[i]))
points(x[i],y[i],pch=21,,cex=cex,
bg=rgb(node_cols[i,1],node_cols[i,2],node_cols[i,3]))
}
invisible(scale)
}
OK, so let’s try it! We have to be careful here, though. The way we wrote our function it expects numeric support values for nodes – but the way node labels are stored is as text. This is easy enough to fix using as.numeric
as follows.
bs<-as.numeric(mammal_tree$node.label)
## Warning: NAs introduced by coercion
bs
## [1] NA 0.901 1.000 1.000 0.562 1.000 0.973 0.344 0.773 1.000 1.000 1.000 0.671 1.000
## [15] 0.739 0.989 0.987 0.999 0.994 0.999 0.627 1.000 0.997 0.758 0.704 1.000 0.976 1.000
## [29] 1.000 0.998 1.000 1.000 1.000 1.000 0.948 0.991 0.762 1.000 1.000 1.000 0.861 0.903
## [43] 1.000 1.000 1.000 NA
Finally, our plot!
plotTree(ladderize(mammal_tree),fsize=0.7)
minmax<-add_support_labels(support=bs)
add.color.bar(leg=0.3*max(nodeHeights(mammal_tree)),
cols=colorRampPalette(c("white","black"))(100),
lims=minmax*100,title="bootstrap %",
subtitle="",prompt=FALSE,x=0.01*par()$usr[2],
y=0.02*par()$usr[4])
Now, in the actual user question, the list subscriber wanted to color node labels discretely – such that support values > 0.95 were colored black, those < 0.95 and > 0.75 a different color (say), and so on.
Without modification, we can do that using our function of above – just by updating the input. Let’s see.
bs[bs<0.75]<-0
bs[intersect(which(bs>=0.75),which(bs<0.95))]<-0.5
bs[bs>=0.95]<-1
plotTree(ladderize(mammal_tree),fsize=0.7)
add_support_labels(support=bs)
legend("bottomleft",c("> 0.95","< 0.95 & > 0.75","< 0.75"),
pch=21,pt.bg=c("black",rgb(0.5,0.5,0.5),"white"),
bty="n",pt.cex=1.5,cex=0.8)
OK. That’s it for now!