Wednesday, April 25, 2012

Simultaneously plot LTT from multiple trees in a "multiPhylo" object

Frank Burbrink asked:

is there an existing function to ltt.plot a Pp distriubtion of trees?

This already exists in the ape package (for ultrametric trees only) in the function mltt.plot. I have also added it to the phytools function ltt.

This was pretty straightforward to do, I just added code in the following general form to deal with the input of multiple trees in the form of an object of class "multiPhylo":

if(class(tree)=="multiPhylo"){
   X<-lapply(tree,ltt,plot=FALSE,drop.extinct=drop.extinct, log.lineages=log.lineages,gamma=gamma)
   max.lineages<-max(sapply(X,function(y) max(y$ltt)))
   ...
   if(plot==TRUE){
      ...
   }
   return(X)
} else {
   ...

I have posted the code online here. I will also include it in the next version of phytools.

Just to show how it works, let's test it out:

> # load phytools & the function source
> require(phytools); source("ltt.R")
> # simulate trees
> trees<-pbtree(n=100,scale=10,nsim=100)
> # plot ltts
> Z<-ltt(trees,log=F)
Pretty cool. The function also returns the branching times and number of lineages, as well as values for Pybus & Harvey's (2000) γ-statistic & P value. So, for instance, if we want to plot a histogram showing the distribution of γ among trees we can just do:

> gamma<-sapply(Z,function(y) y$gamma)
> hist(gamma)
Or if we would like to calculate the number of significant values of γ in our set:

> nsig<-sum(sapply(Z,function(y) y$p<0.05))
> nsig
[1] 3

Cool.

5 comments:

  1. Hi Liam, I was wondering if the function you've just posted could be used to get an estimate of whether, based on the posterior distribution of trees, there is departure from a birth death model. Do you think that, using area between curves, one could calculate, for each of the ltt.lines of the posterior distribution of trees (or multi.phylo object) the area below or above the ltt line of a birth-death model, giving a negative value to areas where the observed line falls below the birth-death line and positive values to areas where the observed line falls above the birth-death line and then summing over all areas? (Hope this is clear enough). It may be a relatively simple means of incorporating phylogenetic uncertainty in an estimate of departure from a BD model, based on ltt.plots.

    ReplyDelete
  2. Interesting suggestion, Alejandro. However, I think that rather than using the LTT plots, it would be better just to fit multiple models (say a simple BD model and a model in which speciation varies through time) to a large random sample of trees from the posterior. And then using AIC weights (or whatever model selection method you prefer) to calculate the support amongst your candidate models for the BD process. I think this is better because there is more information in the actual trees (with topology and branch lengths) than can be gleamed from the ltt plot alone.

    ReplyDelete
  3. The axes is reversed. Thanks for the cool function.

    ReplyDelete
  4. Dear all

    For comparative purposes, I have seen that the shape of the ltt-plot (log) changes radically for different groups (N>100). It is exponential for birds and mammals but a hump for diatoms, bacteria and yeasts. Does it make any sense?

    ReplyDelete
  5. Sorry... I forgot the most important part: does anyone know about references of ltt-plots in different groups across the tree of life?

    ReplyDelete