Thursday, April 10, 2014

Computing the height above the root of a node in the tree

phytools already has a function that computes the height of all the nodes in the tree: nodHeights. If we only want to know the height of one node, then using this function is very inefficient, for obvious reasons.

The following is alternative code for computing the height of only one node. It uses the phytools internal function getAncestors, so if we are not using it as part of phytools we will first have to do:

getAncestors<-phytools:::getAncestors

And here is the function:

nodeheight<-function(tree,node){
  if(node==(length(tree$tip.label)+1)) h<-0
  else {
    a<-setdiff(c(getAncestors(tree,node),node),
      length(tree$tip.label)+1)
    h<-sum(tree$edge.length[sapply(a,function(x,e)
      which(e==x),e=tree$edge[,2])])
  }
  h
}

That's it.

Tuesday, April 8, 2014

New options & methods for ltt95

A couple of days ago I received the following comment about the phytools function ltt95:

"I'm using ltt95 function with the times method and log transformed y axis, but I get reversed ages (both in the plot and in the results table). I’ve not been able to tackle down this small issue."

Well, this isn't precisely a bug of ltt95. ltt95 merely records the accumulation of lineages forward in time from the root; rather than backward (as I believe is more common in publications) from the present time.

In a newly updated version of the function I have now added the argument xaxis which can assume three values: "standard" (forward in time lineage accumulation, as in the current CRAN version of phytools); "flipped" (the time-from-the-present plot); and "negative" (similar to "flipped", but with negative time from the present). At the same time I also have modified ltt95 to return an object of class "ltt95" invisibly (don't worry, this is just the same matrix with some attributes); and I have added S3 generic plotting & print methods for objects of this class.

Here's a demo - but the function can be used in more diverse ways than I will demonstrate here:

> library(phytools)
Loading required package: ape
Loading required package: maps
Loading required package: rgl
> packageVersion("phytools")
[1] ‘0.4.0’
> ## simulate some trees
> trees<-pbtree(n=100,scale=50,nsim=100)
> trees
100 phylogenetic trees
> ## run ltt95
> obj<-ltt95(trees)
> ## ok, now let's flip the axis & plot on semi-log
> ## (I could have done this originally too)
> obj
Object of class "ltt95".
  alpha:        0.05
  method:       lineages
  mode:         median
  N(lineages):  100
> plot(obj,xaxis="flipped",log=TRUE)

Code for this new function version is here and it is in a phytools build, which can be downloaded & installed from source.

That's all for now.

Monday, April 7, 2014

New function to paint individual branch or branches of the tree

A few days ago I received the following email:

"I have recently come across your excellent paper entitled "On the analysis of evolutionary change along single branches in a phylogeny". In that you employed a software "available on request", so I am writing to request permission to obtain and use your software. Alternatively, I wonder whether the functions from that software were incorporated into the Phytools R package. I am sorry if I am missing the obvious and it is indeed in the Phytools package - I am not familiar with all functions in that package yet."

(One interesting attribute of this message is that it harks back to a day in the not too distant past in which sharing data or software 'on request' seemed more than reasonable - how quaint & passé that seems today!)

The paper that's being referred to here is Revell (2008; Am. Nat.) in which I show that a (at that time) new likelihood method by O'Meara et al. (2006) could be used to test hypotheses about exceptional evolution along (one or more) single branch(es) of the tree. Previously McPeek (1995) had proposed a cruder approach based loosely on contrasts, which I showed had reasonable type I error; however I also showed that the likelihood method was probably preferable under most circumstances.

Indeed this method is available in phytools in the form of brownie.lite; although to 'paint' branches which we have a priori hypothesized to have a higher or lower evolutionary rate is a little bit tricky. I decided to make this a whole lot easier by adding the function paintBranches to the phytools package (new source build here).

Here is an example workflow using the function to test the hypothesis of a higher rate of evolution along a specific branch of the tree:

> library(phytools)
Loading required package: ape
Loading required package: maps
Loading required package: rgl
> packageVersion("phytools")
[1] ‘0.3.99’
> ## simulate data with a high rate on one branch
> tree<-pbtree(n=26,tip.label=LETTERS[26:1])
> plotTree(tree)
> nodelabels()
> ## stretch one of the branches by x 100
> tt<-tree
> tt$edge.length[which(tt$edge[,2]==29)]<-
 tt$edge.length[which(tt$edge[,2]==29)]*100
> ## simulate data on distorted tree
> x<-fastBM(tt)
> ## paint branch with second regime to fit 2-rate model
> tree<-paintBranches(tree,29,"2")
> plotSimmap(tree,lwd=4)
no colors provided. using the following legend:
      1       2
"black"   "red"
> nodelabels()
> fit<-brownie.lite(tree,x)
> fit
ML single-rate model:
        s^2     se      a       k  logL
value   1.41    0.39    1.35    2  -39.37 

ML multi-rate model:
        s^2(1)  se(1)   s^2(2)  se(2)   a       k  logL 
value   0.96    0.26    40.24   60.27   0.12    3  -35.82

P-value (based on X^2): 0.01

R thinks it has found the ML solution.

We can use the same approach if our hypothesized exceptional regime is found on several branches in the tree. In this case, the function argument edge is an integer vector instead of a single node number. Finally, if we have an a priori hypothesis of multiple classes of exceptional rate, we can test that too using multiple calls to paintBranches.

That's it.

Tuesday, March 25, 2014

New version of Rphylip with Rseqboot & demo

I just submitted a new version (Rphylip 0.1-23) of the Rphylip package ('an R interface for PHYLIP') to CRAN. This version fixes a couple of bugs in the first CRAN version - including taxon name length limits in Rconsense and Rtreedist (which are present in the corresponding PHYLIP programs CONSENSE and TREEDIST, but can easily be circumvented when calling PHYLIP from within R). It also introduces the new function Rseqboot, which is an R interface two SEQBOOT. SEQBOOT implements a range of non-parametric bootstrapping, jacknife, and data permutation methods. Because it can take a variety of different character types as input, writing the interface was a bit of a pain in the butt - but it is finished to my satisfaction today.

Here's a demo of bootstrapping, distance matrix calculation from all bootstrap samples, and then consensus phylogeny estimation using the Rphylip package:

> require(Rphylip)
Loading required package: Rphylip
> packageVersion("Rphylip")
[1] ‘0.1.23’
> ## load primate dataset
> data(primates)
> ## basic bootstrapping
> X<-Rseqboot(primates,quiet=TRUE)
> ## compute distance matrices
> D<-lapply(X,Rdnadist,quiet=TRUE)
> ## compute all NJ trees
> ## note Rneighbor(...,quiet=TRUE) is
> ## not as quiet as it should be!
> trees<-lapply(D,Rneighbor,quiet=TRUE)
> ## reroot all trees using midpoint rooting
> require(phangorn)
Loading required package: phangorn
> trees<-lapply(trees,midpoint)
> class(trees)<-"multiPhylo"
> ## compute consensus tree
> tree<-Rconsense(trees,quiet=TRUE,rooted=TRUE)
> ## now plot the result with the bootstrap %s
> plot(tree,edge.width=2,no.margin=TRUE)
> ## find edges
> e<-sapply(2:tree$Nnode+length(tree$tip.label),
 function(x,y) which(y==x),y=tree$edge[,2])
> ## add bootstrap
> edgelabels(tree$node.label[2:tree$Nnode],e,pos=3,
 frame="none",bg="transparent")

Cool. Note that RETREE in PHYLIP does midpoint rooting, but this program cannot yet be called with Rphylip. The same general pipeline could be used with ML, MP, or other phylogeny inference methods in PHYLIP (although this would be slower, of course).

This version of Rphylip is already available from GitHub, but hopefully will also be accepted on CRAN soon.

Friday, March 14, 2014

Putting a barplot next to a plotted tree

Today a phytools user contacted me about creating a plot that looks like this. Well, I'm not going to try to duplicate this exactly, but here is a quick demo about how to put a bar plot next to a plotted tree with a continuous character map overlain:

## first let's simulate some tree & data to work with
tree<-pbtree(n=40)
x<-fastBM(tree)

Now let's create our plot:

## create a split plot
layout(matrix(c(1,2),1,2),c(0.7,0.3))
## plot our tree
xx<-contMap(tree,x,mar=c(4.1,1.1,1.1,0),res=200,plot=FALSE)
plot(xx,legend=FALSE,mar=c(4.1,1.1,1.1,0))
## click to add legend interactively
add.color.bar(1,cols=xx$cols,lims=xx$lims,title="")
## add bar plot
par(mar=c(4.1,0,1.1,1.1))
barplot(x[tree$tip.label],horiz=TRUE,width=1,space=0,
  ylim=c(1,length(tree$tip.label))-0.5,names="")

Obviously, the same approach could be used with an ordinary right-facing phylogram or a stochastic character mapped tree.

Wednesday, March 12, 2014

roundPhylogram now in phytools

A slightly more fully functional version of roundPhylogram (described on the blog earlier today) is now part of the phytools package. The function source code can be viewed here, and updated phytools package source can be downloaded from the phytools page.

Here's a demo, using the Caribbean anole tree:

> require(phytools)
Loading required package: phytools
Loading required package: ape
Loading required package: maps
Loading required package: rgl
> packageVersion("phytools")
[1] ‘0.3.97’
> data(anoletree)
> roundPhylogram(anoletree,ftype="i",fsize=0.7)

Plotting a right-facing round phylogram

Don't ask me why I'm working on this. Here's how to do it:

roundPhylogram<-function(tree){
  n<-length(tree$tip.label)
  # reorder cladewise to assign tip positions
  cw<-reorder(tree,"cladewise")
  y<-vector(length=n+cw$Nnode)
  y[cw$edge[cw$edge[,2]<=n,2]]<-1:n
  # reorder pruningwise for post-order traversal
  pw<-reorder(tree,"pruningwise")
  nn<-unique(pw$edge[,1])
  # compute vertical position of each edge
  for(i in 1:length(nn)){
    yy<-y[pw$edge[which(pw$edge[,1]==nn[i]),2]]
    y[nn[i]]<-mean(range(yy))
  }
  # compute start & end points of each edge
  X<-nodeHeights(cw)
  ## end preliminaries
  # open & size a new plot
  plot.new(); par(mar=rep(1.1,4))
  plot.window(xlim=c(0,1.05*max(X)),ylim=c(1,max(y)))
  # plot edges
  for(i in 1:nrow(X)){
    b<-y[cw$edge[i,1]]
    c<-X[i,1]
    d<-if(y[cw$edge[i,2]]>y[cw$edge[i,1]]) 1 else -1
    xx<-X[i,2]
    yy<-y[cw$edge[i,2]]
    a<-(xx-c)/(yy-b)^2
    curve(d*sqrt((x-c)/a)+b,from=X[i,1],to=X[i,2],add=TRUE,
      lwd=2)
  }
  # plot tip labels
  for(i in 1:n)
    text(X[which(cw$edge[,2]==i),2],y[i],tree$tip.label[i],
      pos=4,offset=0.3,font=2)
}

Now let's see how it works:

> library(phytools)
> source("roundPhylogram.R")
> tree<-pbtree(n=26,tip.label=LETTERS[26:1])
> roundPhylogram(tree)