Today I was asked the following question (slightly paraphrased):
“How do I indentify the node defining the youngest subtree with N (for arbitrary N, in his case it was 5) descendant taxa?”
This is pretty easy. For demonstrative purposes, let's first simulate a tree:
library(phytools)
## simulate tree
tree<-pbtree(n=26,tip.label=LETTERS)
Now let's count the number of tips descended from each internal node of the tree:
## number of descendant taxa by node
N<-setNames(sapply(1:tree$Nnode+Ntip(tree),
function(x,tree) sum(getDescendants(tree,x)<=Ntip(tree)),tree=tree),
1:tree$Nnode+Ntip(tree))
## we can plot them to make sure we got it right
plotTree(tree,mar=c(2.1,0.1,0.1,0.1))
nodelabels(N)
Next, we can compute the depth of each node. We can do this in more than one
way. Here, I have an sapply
call around nodeheight
,
but for a larger tree I would use one call to nodeHeights
.
## node depths
d<-setNames(max(nodeHeights(tree))-
sapply(1:tree$Nnode+Ntip(tree),nodeheight,tree=tree),
1:tree$Nnode+Ntip(tree))
## again, let's check them by plotting
plotTree(tree,mar=c(2.1,0.1,0.1,0.1))
nodelabels(round(d,2))
axis(1,at=round(0:5/5*max(nodeHeights(tree)),2))
Finally, let's find the most recent node (or nodes) with n=5
or more descendant tips:
## which node is the youngest node to have n=5 or more
## descendant taxa
n<-5
node<-as.numeric(names(which(d[N>=n]==min(d[N>=n]))))
Let's check it visually:
plotTree(tree,mar=c(2.1,0.1,0.1,0.1))
axis(1,at=round(0:5/5*max(nodeHeights(tree)),2))
ii<-which(names(d)==node)
lines(max(nodeHeights(tree))-c(d[ii],d[ii]),
y=par()$usr[3:4],lty="dashed",col="red")
nodelabels(node=node,"X")
That's it!
No comments:
Post a Comment
Note: due to the very large amount of spam, all comments are now automatically submitted for moderation.