Yesterday (more accurately, early this morning), I posted about a new plotting method for plotting a backbone phylogeny with diversities for each subtree represented as a triangle. This type of visualization is quite common in the literature; however, I'm told (by Luke Mahler, so blame him if this is incorrect) that there is no such plotting method so far in R.
The way I did this was by first creating a new type of phylogeny object, of class "backbonePhylo". This object is stored in a very similar way to the typical phylogeny object in R (class "phylo"), which the exception that we have replaced the vector tip.label with the list tip.clade. tip.clade is a list in which each element contains a label, a diversity (i.e., the number of tips hidden in the subtree), and a crown depth.
I have written some of the methods around this object as S3 methods, which means they can be dispatched using (for instance) calls to the generic plot, reorder, print, etc.
This morning, I did a little more work on this suite of functions. Most importantly, I created a new function (phylo.toBackbone) that converts an object of class "phylo" to an object of class "backbonePhylo" with the help of a translation table. The translation table is a data frame with four variables: tip.label (containing the names of the tips in the input tree - these might be exemplar taxa); label (containing the names of the subtrees - these might be the names of higher taxa); N (obviously, the diversities of each subtree); and depth (the age of each crown group, in terms of the branch lengths of the input tree - note that this cannot be longer then the corresponding tip edge). Note that not all tips in the original tree need to be in the translation table. Those that are not will be treated as clades containing one taxon in the backbone tree.
Here's a demo using random subtree sizes on a stochastic backbone tree. New code for this is here.
> library(geiger) # to help us build our tree
> source("backbonePhylo.R")
> ## now let's create our backbone tree
> tree<-transform(pbtree(n=10),model="lambda",lambda=0.5)
> ## for old versions of geiger, use lambdaTree
>
> ## create a translation table
> ## leaving a couple of single-taxon clades for fun
> tip.label<-sample(tree$tip.label,8)
> clade.label<-LETTERS[1:8]
> N<-ceiling(runif(n=8,min=1,max=20))
> ## set crown node depth to 1/2 the maximum depth
> depth<-sapply(tip.label,function(x,y) 0.5*y$edge.length[which(tree$edge[,2]==which(y$tip.label== x))],y=tree)
> trans<-data.frame(tip.label,clade.label,N,depth)
> rownames(trans)<-NULL
> rm(tip.label,clade.label,N,depth)
>
> ## here's what trans looks like
> trans
tip.label clade.label N depth
1 t2 A 6 0.8992932
2 t4 B 5 0.7617329
3 t3 C 16 0.8674696
4 t5 D 9 0.5984337
5 t1 E 5 0.8992932
6 t10 F 7 0.5554955
7 t7 G 8 0.5664347
8 t9 H 7 0.5554955
>
> ## convert
> tt<-phylo.toBackbone(tree,trans)
>
> ## plot
> plot(tt)
That's pretty good.
Nice Liam.
ReplyDeleteI think I may have mentioned that the diversitree package at one point had a function that would do this (plot.clade.tree), but that function doesn't seem to be present in the current CRAN version of the package (0.9-3). Regardless, I think it's super useful. It's pretty common that people want to illustrate clades in a manner that reflects their true diversities. Unless you have a completely sampled tree (i.e., almost never), this would require manually modifying a tree to contain polytomies with the correct numbers of species, plotting these in Figtree, and then collapsing the polytomies to triangles. This is much better!
This is great! I haven't tried running the code yet, but it seems like there is some relation between depth and the width of the triangles. For example, G has N = 8, whereas for F N = 7, but in the plot it looks like G is more diverse... Is there a way to account for that?
ReplyDeleteThe horizontal depth of each subtree can be controlled by the user. You just set it in the data frame trans when using phylo.toBackbone. It just cannot be deeper than the total terminal branch length of the backbone tree edge that the triangle is replacing. I arbitrarily set it to 1/2 that length in this example (which is why the depth varies among clades).
DeleteHi Liam,
ReplyDeleteI'm not sure I understand what exactly goes into the translation table and how this relates to the tree for phylo.toBackbone. I know which tips are present in the subtrees I want to collapse, so do I generate a table with every single tip listed under tip.label and then repeat clade labels, N and depth for all rows in each clade? e.g.
tip.label clade.label N depth
samp1 A 1 0.1
samp2 B 2 0.1
samp3 B 2 0.1
samp4 C 3 0.2
samp5 C 3 0.2
samp6 C 3 0.2
Or do I have to first collapse the subtrees and then create a backbone tree based on the remaining tips and a translation table like the example above?
Many Thanks,
James
Did you manage to get the backbone tree working? I think the scenario you presented on the phytools website is similar to mine.
DeleteAny of you get the backbone tree? I have the same issue and I am working with 95000 tips :(
DeleteCan we do this in cophylo?
ReplyDeleteHello everyone!
ReplyDeleteI've been racking my brain for a couple of weeks with these few lines of code, I hope you can help me.
I'm trying to create a backbontree using a phylogeny with 9000+ terminals. I have reviewed most of the entries (probably all) related to the use of this function, however I am stopped because I am receiving two errors when graphing.
The first when I include "row.names=1" when loading my data:
1. Error in xy.coords(x, y) : 'x' and 'y' lengths differ
And the second when I do not include "row.names=1" when loading the data
2. Error in col[sapply(x$tip.clade, function(x) x$label)] :
invalid subscript type 'list'
I also noticed that when generating the Backbonephylo object it does not respect the tip.label name.
Have any of you run into this problem or know how to solve it?