Thursday, May 12, 2011

Issue with extract.clade() relevant to evol.rate.mcmc()

Paolo Piras just clued me into an issue with the {ape} function extract.clade() that affects the pre-processing of the posterior sample from our two-rate Bayesian MCMC analysis performed using evol.rate.mcmc(). Basically, for some orderings of the $edge matrix (in particular, the "cladewise" order given by {phangorn} functions), extract.clade() appears to scramble the tip labels under certain conditions.

For instance, from Paolo's example (but with tip labels altered) if we take the following tree (with nodes labeled according to their indexes in phy$edge):



then extract the clade descended from node "35" and plot it:

> tr<-extract.clade(phy,node=35)
> plot(tr)


we get the following:



Uh-oh - for some reason the tip species "R" & "V" and "T" & "X" have been swapped in the extracted clade!!

It turns out that the easiest way to resolve this is by first computing:

> phy<-read.tree(text=write.tree(phy))

which will guarantee that extract.clade() works properly for this problem. One thing to note, though, is that doing so will change the indexes for the nodes in $edge. Thus, it is essential that this be done before running evol.rate.mcmc() and not after evol.rate.mcmc() but before posterior.evolrate(). I am presently trying to figure out what the smartest way to build this into evol.rate.mcmc() would be.

As an aside, though, this bug is not lethal to evol.rate.mcmc() as the specific ordering of the tip labels in the extracted subtree is not used by posterior.evolrate() (although creating the plotting given here will be affected).

3 comments:

  1. Liam:

    As far as I can tell, this bug in extract.clade() has been around for a couple of years and hasn't been fixed yet. I needed to extract clades for something I was doing, but wanted to preserve node ordering and labels, so I wrote a hack using the phylobase function descendants(), and the ape function drop.tip().

    my.extract.clade<-function(phy,node,root.edge=0){
    #requires ape and phylobase be loaded
    phy.phylo4<-phylo4(phy)
    save<-descendants(phy.phylo4,node)
    extracted<-drop.tip(phy,(1:length(phy$tip.label))[(1:length(phy$tip.label))%in%save==F],root.edge=root.edge)
    return(y)
    }

    Awkward, but seems to work. Hope it helps somebody.

    ReplyDelete
  2. I am facing some difficulties in using extrac.clade() function.
    I made an unrooted tree with the nj() function and obtain 1620 tips and 1618 nodes.
    When I am trying to extract.clade from node 1 for example, the following error message appears
    "Erreur dans extract.clade(alpha.nj, node = 1) :
    node number must be greater than the number of tips"
    Do you have any idea how I can fix this problem?
    I might just have forgotten some step. I'm not that familiar with R.
    Thank you in advance,

    ReplyDelete
    Replies
    1. The nodes in the tree are numbered from N -> N+number of nodes where N is the number of tips in your tree. That means that for your tree the internal node numbers (and thus the numbers acceptable for extract.clade) run from 1621 through 3238.

      To find the parent node of the clade you want to pull out, you should look at the function findMRCA in phytools.

      Delete