Wednesday, May 22, 2013

extract.clade for tree with a mapped discrete character

A phytools reader recently asked the following via the general comments page:

I have used make.simmap to obtain 1000 stochastic maps for a larger clade. It contains two sister-clades. My main interest is to compare transitions for these two sister-clades. describe.simmap gives average values for the entire clade, while, countSimmap provides number of transitions for 1000 trees. Is there a way to obtain these values individually for sub-clades within phytools, or do I have to run the analyses separately for each sister-clade to obtain these values?

Well, firstly, I would not recommend running stochastic mapping separately for each subtree of interest in your phylogeny - unless you would like to assume that the substitution process for your discrete character is different in different parts of the tree. (Even if it is - unless our subtrees are large then we probably don't have enough information to estimate this difference anyway.) What we'd really like to do is run stochastic mapping on the full tree, and then extract our clades of interest to run describe.simmap on each subtree. Unfortunately - extract.clade from the 'ape' package won't preserve the discrete character mapping on our stochastic mapped tree!

Luckily, there is a solution. phytools does have an analogous function to drop.tip (called drop.tip.simmap), and drop.tip and extract.clade are (in some ways) just the inverse of each other. Let's try to write a mapping compatible version of extract.clade for phylogenies with a mapped discrete character using drop.tip.simmap:

extract.clade.simmap<-function(tree,node){
  x<-getDescendants(tree,node)
  x<-x[x<=length(tree$tip.label)]
  drop.tip.simmap(tree,tree$tip.label[-x])
}

OK, that was easy. Now let's try it:

> Q<-matrix(c(-1,1,1,-1),2,2)
> colnames(Q)<-rownames(Q)<-letters[1:2]
> tree<-sim.history(pbtree(n=100,scale=1),Q)
> cols<-setNames(c("blue","red"),letters[1:2])
> plotSimmap(tree,cols,fsize=0.6,node.numbers=T,lwd=3, pts=F)
Now let's extract the clade descended from node 106 & plot:
> tree106<-extract.clade.simmap(tree,106)
> plotSimmap(tree106,cols,fsize=0.8,lwd=3,pts=F)
Cool.

Note that although the basic functionality is the same, the options and arguments of extract.clade.simmap and extract.clade are not the same.

1 comment:

  1. VJ - please let us know if you see this post as I'd prefer not to 'reply' on the general comments page as for some reason posting to the Comments page clears the comment sidebar - to my considerable annoyance! Thanks! Liam

    ReplyDelete

Note: due to the very large amount of spam, all comments are now automatically submitted for moderation.