Tuesday, September 5, 2017

Identifying the tips in all monophyletic clades containing exactly 11 taxa

A colleague just asked me the following:

“My question: if I have a particular number of taxa that I want to find a monophyletic group containing only that number of taxa out of a larger tree, is there such a function? For example, I have a tree of 100 taxa and I want a list (or lists) of exactly 11 taxa sharing an mrca, above that mrca should only contain 11 taxa and not more.”

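This could be done in multiple ways. Here's one that uses ape::extract.clade to count the tips descended from each internal node, and then pulls out the nodes with exactly 11: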

library(phytools)
set.seed(3)
tree<-pbtree(n=100)
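# node numbers of all internal nodes (Ntip+1 through Ntip+Nnode)
nodes<-1:tree$Nnode+Ntip(tree)
# number of tips descended from each internal node
Ndesc<-sapply(nodes,function(x,tree) Ntip(extract.clade(tree,x)),tree=tree)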
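
Calling extract.clade once per node is fine for a tree of 100 taxa, but it rebuilds a subtree every time. As an aside, here is a minimal sketch of an alternative that gets the same counts in a single postorder pass over the edge matrix. The function name count_tips is my own (it is not part of ape or phytools), and the sketch assumes that reorder(...,order="postorder") leaves the node numbering unchanged and only reorders the rows of tree$edge.

```r
library(ape)

# Hypothetical helper (not in ape or phytools): number of descendant tips
# for every internal node, in the order Ntip+1, ..., Ntip+Nnode.
count_tips <- function(tree) {
  # In postorder, every edge below a node comes before the edge above it
  tree <- reorder(tree, order = "postorder")
  n <- Ntip(tree)
  # each tip counts as 1; internal nodes start at 0
  counts <- c(rep(1L, n), rep(0L, tree$Nnode))
  for (i in seq_len(nrow(tree$edge))) {
    parent <- tree$edge[i, 1]
    child <- tree$edge[i, 2]
    # the child's count is complete by the time we reach its parent edge
    counts[parent] <- counts[parent] + counts[child]
  }
  counts[n + seq_len(tree$Nnode)]
}
```

If the assumption above holds, count_tips(tree) should match Ndesc element for element.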

plotTree(tree,ftype="off",lwd=1)
labelnodes(Ndesc,node=nodes,interactive=FALSE,cex=0.5)

nodes11<-nodes[which(Ndesc==11)]
nodes11
## [1] 116 134 163
nodelabels(Ndesc[which(Ndesc==11)],node=nodes11,cex=0.8,frame="circle",
    bg="red",col="white")


tips<-sapply(nodes11,function(x,tree) extract.clade(tree,x)$tip.label,
    tree=tree)
colnames(tips)<-nodes11
tips
##       116   134   163  
##  [1,] "t47" "t37" "t72"
##  [2,] "t48" "t95" "t73"
##  [3,] "t36" "t96" "t32"
##  [4,] "t89" "t42" "t43"
##  [5,] "t90" "t74" "t44"
##  [6,] "t50" "t75" "t88"
##  [7,] "t52" "t82" "t97"
##  [8,] "t53" "t83" "t98"
##  [9,] "t57" "t69" "t7" 
## [10,] "t58" "t70" "t59"
## [11,] "t49" "t38" "t60"
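
The steps above could be wrapped into a small function for any clade size. What follows is only a sketch: the name clades_of_size and its argument k are made up for illustration. It returns a named list rather than a matrix, so it also behaves sensibly when no clade of the requested size exists (in which case the colnames assignment above would fail).

```r
library(ape)

# Hypothetical wrapper: tip labels of every clade with exactly k tips,
# returned as a list named by the node number of each clade's MRCA.
clades_of_size <- function(tree, k) {
  nodes <- Ntip(tree) + seq_len(tree$Nnode)
  # number of tips descended from each internal node
  ndesc <- vapply(nodes, function(x) Ntip(extract.clade(tree, x)),
    integer(1))
  hits <- nodes[ndesc == k]
  # an empty list results if no node has exactly k descendant tips
  setNames(lapply(hits, function(x) extract.clade(tree, x)$tip.label),
    hits)
}
```

For the tree in this post, clades_of_size(tree,11) should return the same three sets of tips shown in the matrix above.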
