Monday, March 11, 2013

Neat new function to color branches by trait value

I had a user request yesterday for a function that would plot edge colors by probability - for example, if we have a value on [0, 1] for each edge in the tree - how do we use a built-in color palette in R to plot the tree using these edges as our colors? I think that the idea is somewhat akin to what contMap or densityMap accomplish; however if all we want is to plot each edge a different color, the 'ape' function plot.phylo is just fine at doing that.

I wrote a new function, plotBranchbyTrait, which automates this - calling plot.phylo internally; but the central piece of code used by the function is relatively simple. It merely takes our probabilities or reconstructed trait values along branches of the tree and translates them to colors from a palette as follows (using, in this case, a blue→red color map):
cols<-rainbow(1000,start=0.7,end=0) # blue->red
if(is.null(xlims)) xlims<-range(x)+c(-tol,tol)
breaks<-0:1000/1000*(xlims[2]-xlims[1])+xlims[1]
whichColor<-function(p,cols,breaks){
  i<-1
  while(p>=breaks[i]&&p>breaks[i+1]) i<-i+1
  cols[i]
}
colors<-sapply(x,whichColor,cols=cols,breaks=breaks)

What we come out with is a vector of colors to use as input in the argument plot.phylo.

I also have another internally called function, add.color.map, which takes user input for location and then plots a translation map (that also serves as a scale bar, as in, for instance, densityMap). I'd never done this before, but getting an interactively user supplied position for the color map was pretty easy. Here is the code snippet that I used:
cat("Click where you want to draw the bar\n")
x<-unlist(locator(1))
y<-x[2]
x<-x[1]

This is based on something similar in the ape function add.scale.bar.

One warning to the user - the function hangs before printing the "Click where you want to draw the bar" and waits for you to supply the location before the prompt! Not sure how to fix this.

The function has three different modes. mode="edges" uses the user supplied trait value for all the edges in the tree. The input values should be in the row order of tree$edge. mode="tips" just takes tip values and it reconstructs ancestral states by using fastAnc to compute the ancestral node states, and the averages the tipward and rootward states for each edge to get the plotted color. mode="nodes" does the same averaging, but the tip and node values are user supplied.

Here's a quick and dirty demo:
> source("plotBranchbyTrait.R") # load source
> # simplest use: ancestral BM values
> tree<-pbtree(n=40)
> x<-fastBM(tree)
> plotBranchbyTrait(tree,x,method="tips")
Click where you want to draw the bar
> # now let's say we have probabilities for each edge
> pp<-fastBM(tree,bounds=c(0,1),internal=TRUE)
> p<-rowMeans(matrix(pp[tree$edge],nrow(tree$edge),2))
> plotBranchbyTrait(tree,p,xlims=c(0,1),palette="gray")
Click where you want to draw the bar

Other methods in this function should be fairly self explanatory, I hope.

That's it.

24 comments:

  1. Hi Liam,

    This is a great little script, I spent the best part of two days writing some exceedingly ugly code in an attempt to do something very similar. Is there a simple fix for correctly shading the ‘horizontal’ edge lengths with plot.phylo(type = “fan”)?

    Jonathan

    ReplyDelete
    Replies
    1. Unfortunately, unlike my other plotting functions, this function just calls plot.phylo internally - so the problem is with plot.phylo and I can't do much about it. Sorry!

      Thanks for pointing out that this could be done with the function. I hadn't realized it! More on this here.

      - Liam

      Delete
  2. Dear Dr. Revell,

    Great function. I had a very large tree and the default edge.width of 4 was too big. Unfortunately, I couldn't set this (since it was already set). I just removed "edge.width=4" from the line 35 (I think - the plot.phylo call). Then I was able to see all the edges for a big tree and made a series of fantastic figures. Maybe removing edge.width would be useful for others.

    Thanks,
    Daniel Hanley

    ReplyDelete
  3. This comment has been removed by the author.

    ReplyDelete
  4. Dr. Revell,

    I encountered a problem when using this function; it may be because I am not importing the data correctly. I uploaded a phylip tree and a data table into R, then used the following set of functions to associate the data with the tips:

    trait<- data[1][tree$tip.label,]

    names(trait)<-tree$tip.label

    This appeared to make a vector in the appropriate input format for the above function. However, when I called the function, the color pattern among the branches didn't jive with the data I input. To test an extreme case, I created a new vector with character values that were inflated for only a few tips ("1" for most; "5" for a few). The resulting colored tree contained "hot" branches but for the incorrect tips; the vector created appeared to show the correct order. Am I mapping the characters to the tree incorrectly? Any reply would be appreciated.

    Thanks,

    John McVay


    ReplyDelete
    Replies
    1. Hi John.

      Are you using the correct "mode". If your data are the data for tip taxa, then you should use mode="tips".

      Let me know if this solves your problem. If it does not, feel free to shoot me the data & code that you used to generate the error and I will look into it.

      Finally, make sure that you have the latest version of phytools (at least the most recent CRAN release).

      All the best, Liam

      Delete
  5. This comment has been removed by the author.

    ReplyDelete
  6. Dear Liam,
    Thanks for sharing with us that useful piece of code. While looking into the code I could not see any special features related to mode=="edge", while it should order the data in a particular way when considering both nodes and tips:
    http://stackoverflow.com/questions/22100809/in-r-how-to-order-a-vector-of-data-according-to-both-tips-and-edges-of-a-phylog/22102420?noredirect=1#comment33527013_22102420
    Can it otherwise be that I have an older version? (R 3.0.2)
    Regards,
    Xavier

    ReplyDelete
  7. Dr. Revell,
    This is a fantastic piece of code. Thanks for this. But do you think you could provide an example of the format the input data should be in to use the 'edge' mode. I have done some ancestral state reconstructions and want to color the tree with this code, however, I can't seem to get the function to perform properly with the ways in which I'm inputting the reconstructed nodes as well as tips.

    Thanks

    -JDP

    ReplyDelete
    Replies
    1. Hi JDP.

      If I remember correctly, mode="edge" just takes a vector in the order of the rows of tree$edge. Let me know if this is enough info for you to figure out how to proceed.

      If you have done ancestral state reconstructions, you might consider mode="nodes", which computes the average of the two flanking nodes to get the color for each branch. In this case, x is a vector with length(tree$tip.label)+tree$Nnode elements in the order of the node numbers in tree$edge.

      Let me know if you need help figuring this out.

      - Liam

      Delete
    2. Dr. Revell,

      I am having a similar issue when plotting edges:

      data:
      a b a-b c d c-d a-d
      0 0 1 1 0 0 1

      tree.mod:
      ((a:1,b:1)a-b:1,(c:1,d:1)c-d:1)a-d;

      Running:

      R
      library(ape)
      library(phytools)
      f=read.table("data", header=T)
      f = data.matrix(f)
      t=read.tree("tree.mod")
      source("plotBranchbyTrait.R")
      plotBranchbyTrait(tree=t,x=f,mode="edges",xlim=c(0,1))

      The resulting plots does not follow the data. After having spent (much) time on trying to order the labels, I now don't have a clue.

      Thanks,

      X

      Delete
    3. Hi Xavier.

      Yes, this is pretty straightforward.

      For mode edges your vector should be in the order of the rows of tree$edge. To see these edge & line them up with the plotted edges of your tree you can do:

      tree$edge
      plotTree(tree)
      nodelabels()
      tiplabels()

      The plotted node labels and tip labels are the node indices in tree$edge.

      In your case, you also have 7 values in your vector but only 6 edges in the tree. It is possible that what you want to do in that case is use mode="nodes". In this case the edge color is based on the average of the tipward & rootward node of each edge. The elements in the vector should be in the numerical order of the indices of tree$edge in this case.

      Finally, you should install the latest version of phytools rather than loading the function from source.

      All the best, Liam

      Delete
    4. Dear Liam,
      Thanks for your prompt answer, I do appreciate.

      I want to plot edges and thus removed the node a-d from the dataset, and ordered the vector by adding:
      x = x[tree$edge[,2]]
      to your script.

      data:
      a b a-b c d c-d
      0 0 1 1 0 0

      tree.mod:
      ((a:1,b:1)a-b:1,(c:1,d:1)c-d:1)a-d;

      Ran again (with the latest version of phytools):

      R
      library(ape)
      library(phytools)
      f=read.table("data", header=T)
      f = data.matrix(f)
      t=read.tree("tree.mod")
      source("plotBranchbyTrait.R")
      plotBranchbyTrait(tree=t,x=f,mode="edges",xlim=c(0,1))

      and got:
      Error in while (p >= breaks[i] && p > breaks[i + 1]) i <- i + 1 :
      missing value where TRUE/FALSE needed

      Even though the vector follows the order of the edge, I still cannot make that simple example work. Do you maybe see what I am doing wrong?

      Thanks,

      Xavier

      Delete
    5. Most likely your problem is that f is not a vector. Try:

      f<-read.table("data",header=TRUE)
      f<-as.matrix(f)[,1]

      Also, you should not load plotBranchbyTrait from source as you do. It is now part of phytools:

      rm(plotBranchbyTrait)

      All the best, Liam

      Delete
    6. Dear Liam,

      I managed to fix it, by adding in plotBranchTrait.R:

      if( mode == "edges" ){
      lab=c(tree$tip.label,tree$node.label)
      lab = lab[tree$edge[,2]]
      x = x[match(lab,names(x))]
      }

      Furthermore the data had to be open using:
      f=read.table("data", header=T, check.names=FALSE)
      f = as.matrix(f)[1,]

      to be sure that the nodes names keep the hyphen.

      Cheers,
      Xavier

      Delete
  8. Dear Liam,
    that's a great function, very useful!. I wonder: I have a few numeric extrinsic traits variables (like e.g. extinction risk) for which I cannot assume trait evolution and where It doesn't make sense to reconstruct ancestral states.
    My way forward for visualizing such data in phylognetic space would be to simply color all branches according to the average trait value of all tips that descend from that branch (so not accounting for branch length or anything). I can't figure our how to do this. Do you have an idea how I would have to change your function to compute for each branch the mean trait value of all its tips and assign the color accordingly?
    Thanks a lot! Best,
    Carsten

    ReplyDelete
    Replies
    1. Hi Carsten.

      I just addressed this here. Thanks for the question.

      - Liam

      Delete
  9. Dear Liam,

    I find very interesting this tool, but I have some problems with the mode="node". I have my nodes information done with ancestral state reconstruction previusly, and I want to plot them in the tree. I put tips and nodes in order as you explain above, but the result is not congruent.

    There are some areas of the tree that I have the same value in the tips and in the nodes and the colors of edges are not the same or similar...Is there any problem with this mode? Must tree format be in a particular way? (Cronogram, phylogram, branch-lengths) Let me know If you could help me.

    Thanks.

    ReplyDelete
    Replies
    1. Hi Raquel.

      The data should be a vector in order tree$tip.label followed by the 1:tree$Nnode+Ntip(tree) internal nodes.

      To see what this looks like, try out:

      tree<-pbtree(n=26,tip.label=LETTERS)
      x<-fastBM(tree,internal=TRUE)
      x ## correct order
      plotBranchbyTrait(tree,x,mode="nodes")

      Let me know if that works!

      - Liam

      Delete
    2. This comment has been removed by the author.

      Delete
    3. Hi Liam,

      I don´t know why but I get the same results. It seems that something doesn´t work well, but I can´t figure out what it is. I sent you an email with my doubts. I ´d appreciate if you could help me. Thank you very much.
      Best,

      Raquel

      Delete
  10. Hi Liam.

    Thanks for the great posts.
    I would like to colour the branches of a split network, instead of a tree.
    I didn't have success with this function, probably because the object is of the class network as well.
    Do you have some directions to help me to accomplish this?

    Thanks,
    Fernando

    ReplyDelete
  11. Hi Liam,
    I´m a master student who is aiming to disentangle the evolution of a neotropical interaction. I used allot the functions in phytools to plot results (i.e., cophyloplot) and now I want to create a plot.phylo object that contains a tree with colored branches by the most probable ancestral-range. Then, use this tree in cophyloplot function. However, I am struggling with extract the node branching probabilities of the tree (BioGeoBears result object) for construct the x object in your function plotBranchesbyTrait.
    Could you please guide me with a few steps on how to create the x object?
    thank you for your time!
    María

    ReplyDelete
  12. Hi Liam,
    I´m a master student who is aiming to disentangle the evolution of a neotropical interaction. I used allot the functions in phytools to plot results (i.e., cophyloplot) and now I want to create a plot.phylo object that contains a tree with colored branches by the most probable ancestral-range. Then, use this tree in cophyloplot function. However, I am struggling with extract the node branching probabilities of the tree (BioGeoBears result object) for construct the x object in your function plotBranchesbyTrait.
    Could you please guide me with a few steps on how to create the x object?
    thank you for your time!
    María

    ReplyDelete