A few days ago, I received an email with the following text:
"What I have been trying to do is to color the branches of a tree according to the number of steps under parsimony that are required if certain species is attached to the corresponding branch of a backbone tree. As I am paleontologist, I usually need to place certain morphological-defined species onto a backbone constraint tree. But I could not find yet the function to do that. Any help is appreciated!"
Like many such emails (unfortunately) I flagged it to (hopefully) return to later, and then I rediscovered it today. Upon closer look, my first reaction was "why do people think I know how to do these things?" As I started to respond as such, it struck me that in fact this shouldn't be that hard (thanks, in large part, to Klaus Schliep's great package phangorn).
First, let's simulate the empirical case in which the 'backbone' tree for our N taxa is 'known', but we have data for N+1 taxa and we want to know where that extra tip belongs:
require(phytools)
require(phangorn)
## simulate tree
tt<-pbtree(n=26,tip.label=LETTERS[26:1])
## this is our true, full, unrooted phylogeny
plot(unroot(tt),use.edge.length=FALSE,type="unrooted", edge.width=2)
X<-genSeq(tt,l=2000,format="phyDat",rate=0.01)
## now let's drop one of our tips, tip "Z" in this case
tree<-drop.tip(tt,LETTERS[26])
tree<-unroot(tree) # unroot
OK, now we have a data set (X) containing 26 taxa; but an unrooted tree (tree) containing only 25 tips, and we want to know (& plot) the change in parsimony score that results from attaching the one remaining tip to all of the 48 places in the tree to which it could be attached!
## this is just for bind.tip
tree$edge.length<-rep(1,nrow(tree$edge))
## vector for parsimony scores
ps<-vector(length=nrow(tree$edge))
names(ps)<-apply(tree$edge,1,paste,collapse=",")
## tip to attach
tip.label<-LETTERS[26]
## loop over all edges in the tree
for(i in 1:nrow(tree$edge)){
## attach tip
x<-bind.tip(tree,tip.label,1,where=tree$edge[i,2],
position=0.5)
x$edge.length<-NULL
## compute parsimony score
ps[i]<-parsimony(x,X)
}
## subtract parsimony score on 25-taxon tree
ps<-ps-parsimony(tree,X)
Let's see what ps (the parsimony cost of attaching the tip to all edges of the tree) looks like in this one case:
26,1 26,27 27,2 27,3 26,28 28,4 28,29 29,30 30,5
121 122 172 172 97 124 121 146 178
30,31 31,32 32,6 32,7 31,8 29,33 33,34 34,35 35,36
178 211 211 212 211 147 149 155 168
36,9 36,10 35,37 37,38 38,11 38,12 37,13 34,39 39,40
216 216 169 212 224 222 210 153 183
40,41 41,14 41,15 40,16 39,42 42,43 43,44 44,17 44,18
202 216 216 203 183 210 210 217 217
43,19 42,20 33,45 45,46 46,47 47,21 47,22 46,23 45,48
209 209 148 200 219 235 235 219 200
48,24 48,25
216 218
names(ps) here gives the starting & ending node numbers for each edge in the tree & ps its corresponding parsimony cost.
The neatest thing is that we can easily plot these on the tree using the phytools function plotBranchbyTrait (which, unlike other plotting functions in phytools, calls the ape function plot.phylo internally):
It's pretty clear that the cost of attaching tip "Z" is indeed lowest along the edge (the internode connecting the terminal edges leading to tips "V" and "Y") where tip "Z" actually belongs. Cool.
Note that if we had a model for the evolution of our characters (in this case, our data are DNA sequences - but the query pertained to morphological data from fossils so this is not guaranteed) we could easily construct a similar plot showing differences in the likelihood across the tree.
That's it.
That's great!! Many of us (paleontologists) have been coloring the branches of the tree manually!! Again, thanks for your time and effort (and probably love) to support open-source tools.
ReplyDeleteI just noticed that "length = 2.7" is plotted on the legend. That doesn't actually mean anything in this case because the tree does not have branch lengths. (It could, but it does not in this visualization.) There is probably a way to turn that off in plotBranchbyTrait, but I don't remember. - Liam
ReplyDelete