## Wednesday, May 22, 2013

### extract.clade for tree with a mapped discrete character

I have used make.simmap to obtain 1000 stochastic maps for a larger clade. It contains two sister-clades. My main interest is to compare transitions for these two sister-clades. describe.simmap gives average values for the entire clade, while, countSimmap provides number of transitions for 1000 trees. Is there a way to obtain these values individually for sub-clades within phytools, or do I have to run the analyses separately for each sister-clade to obtain these values?

Well, firstly, I would not recommend running stochastic mapping separately for each subtree of interest in your phylogeny - unless you would like to assume that the substitution process for your discrete character is different in different parts of the tree. (Even if it is - unless our subtrees are large then we probably don't have enough information to estimate this difference anyway.) What we'd really like to do is run stochastic mapping on the full tree, and then extract our clades of interest to run describe.simmap on each subtree. Unfortunately - extract.clade from the 'ape' package won't preserve the discrete character mapping on our stochastic mapped tree!

Luckily, there is a solution. phytools does have an analogous function to drop.tip (called drop.tip.simmap), and drop.tip and extract.clade are (in some ways) just the inverse of each other. Let's try to write a mapping compatible version of extract.clade for phylogenies with a mapped discrete character using drop.tip.simmap:

x<-getDescendants(tree,node)
x<-x[x<=length(tree\$tip.label)]
drop.tip.simmap(tree,tree\$tip.label[-x])
}

OK, that was easy. Now let's try it:

> Q<-matrix(c(-1,1,1,-1),2,2)
> colnames(Q)<-rownames(Q)<-letters[1:2]
> tree<-sim.history(pbtree(n=100,scale=1),Q)
> cols<-setNames(c("blue","red"),letters[1:2])
> plotSimmap(tree,cols,fsize=0.6,node.numbers=T,lwd=3, pts=F)
Now let's extract the clade descended from node 106 & plot: