Saturday, March 2, 2024

A simple function for node depths, by various definitions

Recently, a colleague asked how to compute to depth of one or a set of nodes on a tree.

As opposed to height above the root, for a non-ultrametric tree the meaning of node depth is somewhat ambiguous. For example, node depth might be defined as the distance of a node below the highest tip of the tree (overall); the distance below the highest tip descended from the node of interest; or the distance below the lowest tip descended from the node of interest (i.e., ancestor-descendant proximity to any leaf).

Typically, I believe that the first of these three definitions will be the value that’s being sought. If so, it’s indeed very straightforward to create a wrapper of phytools::nodeheight that will compute this quantity. (Note that if we have a lot of depths to compute at once on a large tree, it will normally be faster to use phytools::nodeHeights – but that uses more code, so I won’t show it here.)

node_depth<-function(phy,node=NULL){
  if(is.null(node)) node<-1:phy$Nnode+Ntip(phy)
  h<-max(nodeHeights(phy))
  setNames(h-sapply(node,nodeheight,tree=phy),node)
}

Let’s test this function out.

To do so, I’ll first simulate a non-ultrametric random tree using ape::rtree, then I’ll graph it leftwards but with my axis flipped – so that the right side of the tree is at 0.0 and the tree depth will correspond to the horizontal position of each node along the x axis. Finally, I’ll compute the node depths for all internal nodes and plot them on top of the tree using arrows pointing from each node to the horizontal axis of my graph.

library(phytools)
tree<-rtree(n=6 )
plotTree(tree,direction="leftwards",
  xlim=c(max(nodeHeights(tree)),0),
  mar=c(3.1,1,1.1,1.1))
grid()
axis(1,pretty(nodeHeights(tree)))
nd<-node_depth(tree)
nd
##         7         8         9        10        11 
## 1.8782436 1.6916639 1.1627144 0.7354517 0.9366856
pp<-get("last_plot.phylo",envir=.PlotPhyloEnv)
arrows(
  x0=pp$xx[1:tree$Nnode+Ntip(tree)],
  y0=pp$yy[1:tree$Nnode+Ntip(tree)],
  x1=nd,y1=rep(par()$usr[3],tree$Nnode),length=0.1,
  col=make.transparent("red",0.5),lwd=5)
nodelabels()

plot of chunk unnamed-chunk-2

Now, if we want our function to apply either of the other two possible definitions of “node depth” things gets a tiny bit more complicated.

To write our slightly more elaborated function, I’m going to denominate our three different depth criteria "total" (for the difference in height from the maximum height), "max" (for the difference in depth to the maximum tip height of each clade), and "min" (for the difference to the minimum tip height in the clade).

Here we go. First, this is our function.

node_depth<-function(phy,node=NULL,from="total"){
  if(is.null(node)) node<-1:phy$Nnode+Ntip(phy)
  if(from=="total") h<-max(nodeHeights(phy))
  else if(from%in%c("min","max")){
    foo<-function(node,tree){
      dd<-getDescendants(tree,node)
      dd[which(dd<=Ntip(tree))]
    }
    desc<-lapply(node,foo,tree=phy)
    foo<-function(nodes,tree,from){
      fn<-if(from=="min") min else max
      fn(sapply(nodes,nodeheight,tree=tree))
    }
    h<-sapply(desc,foo,tree=phy,from=from)
  }
  return(setNames(h-sapply(node,nodeheight,tree=phy),node))
}

Now I’ll test it out by generating a random (non-ultrametric) tree, graphing it as before, and then adding arrows from the nodes that point forward in time towards the tips of the tree – merely to show that these arrows align with the minimum or maximum tip height of each node’s subtree! (This is what we were going for.)

We can start with the minimum height of each clade.

library(phytools)
tree<-rtree(n=8)
plotTree(tree,direction="leftwards",
  xlim=c(max(nodeHeights(tree)),0),
  mar=c(3.1,1,1.1,1.1))
grid()
axis(1,pretty(nodeHeights(tree)))
nd<-node_depth(tree,from="min")
nd
##         9        10        11        12        13        14        15 
## 1.0834354 0.5526454 0.5911463 0.1298558 0.3592411 0.4492303 0.2584885
pp<-get("last_plot.phylo",envir=.PlotPhyloEnv)
arrows(
  x0=pp$xx[1:tree$Nnode+Ntip(tree)],
  y0=pp$yy[1:tree$Nnode+Ntip(tree)],
  x1=pp$xx[1:tree$Nnode+Ntip(tree)]-nd,
  y1=pp$yy[1:tree$Nnode+Ntip(tree)],length=0.1,
  col=make.transparent("red",0.25),lwd=5)
nodelabels()

plot of chunk unnamed-chunk-4

Now proceed to the maximum height of each clade.

library(phytools)
tree<-rtree(n=8)
plotTree(tree,direction="leftwards",
  xlim=c(max(nodeHeights(tree)),0),
  mar=c(3.1,1,1.1,1.1))
grid()
axis(1,pretty(nodeHeights(tree)))
nd<-node_depth(tree,from="max")
nd
##         9        10        11        12        13        14        15 
## 3.1375804 0.9654019 0.9140639 2.2835820 1.5188923 1.4671948 0.9892051
pp<-get("last_plot.phylo",envir=.PlotPhyloEnv)
arrows(
  x0=pp$xx[1:tree$Nnode+Ntip(tree)],
  y0=pp$yy[1:tree$Nnode+Ntip(tree)],
  x1=pp$xx[1:tree$Nnode+Ntip(tree)]-nd,
  y1=pp$yy[1:tree$Nnode+Ntip(tree)],length=0.1,
  col=make.transparent("red",0.25),lwd=5)
nodelabels()

plot of chunk unnamed-chunk-5

Neat. That totally worked.

1 comment:

  1. Hi Liam, this is great! Do you think it would be possible to extract the branch lengths between internal nodes, so that can have split into different clades based on a maximum internal branch length?

    ReplyDelete

Note: due to the very large amount of spam, all comments are now automatically submitted for moderation.