Saturday, May 3, 2014

Dropping tips while retaining the ancestors of remaining extant tips as singleton nodes

Luke Mahler asked the following:

"Do you know of a way to drop a terminal branch from a phylogeny, yet preserve the node it came from as a singleton node? I initially thought drop.tip(trim.internal=F) would do this, but it does something a little different, apparently (it preserves internal branches that become tips by pruning, but not nodes that would become singleton nodes)."

In the simple case in which we just want to drop one tip, this is relatively straightforward. We just have to drop the corresponding row & element from tree$edge, tree$edge.length, and tree$tip.label, and then update our node & tip numbers in tree$edge to follow the "phylo" object convention. However, generalizing to drop an arbitrary number of tips (while retaining all ancestral nodes to extant tips, regardless of whether they now have one or multiple descendants) now becomes a little bit trickier. Here is my function for this:

drop.tip.singleton<-function(tree,tip){
  N<-length(tree$tip.label)
  m<-length(tip)
  ii<-sapply(tip,function(x,y) which(y==x),y=tree$tip.label)
  tree$tip.label<-tree$tip.label[-ii]
  ii<-sapply(ii,function(x,y) which(y==x),y=tree$edge[,2])
  tree$edge<-tree$edge[-ii,]
  tree$edge.length<-tree$edge.length[-ii]
  tree$edge[tree$edge<=N]<-
    as.integer(rank(tree$edge[tree$edge<=N]))
  tree$edge[tree$edge>N]<-tree$edge[tree$edge>N]-m
  N<-N-m
  if(any(sapply(tree$edge[tree$edge[,2]>N,2],"%in%",
    tree$edge[,1])==FALSE)) internal<-TRUE
  while(internal){
    ii<-which(sapply(tree$edge[,2],"%in%",c(1:N,
      tree$edge[,1]))==FALSE)
    nn<-sort(tree$edge[ii,2])
    tree$edge<-tree$edge[-ii,]
    tree$edge.length<-tree$edge.length[-ii]
    for(i in 1:length(nn)) tree$edge[tree$edge>nn[i]]<-
      tree$edge[tree$edge>nn[i]]-1
    tree$Nnode<-tree$Nnode-length(ii)
    if(any(sapply(tree$edge[tree$edge[,2]>N,2],
      "%in%",tree$edge[,1])==FALSE)) internal<-TRUE
    else internal<-FALSE
  }
  tree
}

Now try it:

> tree<-pbtree(n=26,tip.label=LETTERS)
> plotTree(tree)
> tip<-sample(LETTERS,10)
> tip
[1] "N" "M" "F" "I" "Z" "R" "P" "S" "G" "W"
> tt<-drop.tip.singleton(tree,tip)
> plotTree.singletons(tt)

This seems to be the correct result.

5 comments:

  1. This works in most cases, but it has a small bug. I believe I have already fixed it & I will post the code shortly.

    ReplyDelete
    Replies
    1. Thanks Liam, it worked for me! I haven't used "plotTree.singletons()" yet and will play around with it. Is it flexible at all? (i.e., can it take any of the standard plot.phylo() parameters?)

      Delete
    2. Hi Luke. The fixed code (already sent to you via email) is here.

      No - plotTree.singletons is not particularly functional. It was just written as a check. Do you have some more serious use for this?

      - Liam

      Delete
    3. Whoops, it was the fixed code you sent that worked, just to clarify.

      As for using plotTree.singletons, what I had in mind was using singleton nodes to distinguish stem from crown in highly simplified backbone trees. E.g., if I were to prune a bunch of recent clades down to a single tip each, it would in some cases still be useful to indicate where along that branch the transition from stem to crown occurred (one way to do this would be to use a thicker line for the "crown" portion). Just a thought I had - in my case it's very easy to do this with some quick post-hoc modification to the figure.

      Delete
    4. This would be pretty easy - but since, so far as I know, you are the only one using plotTree.singletons, we should probably just chat on the phone so that I know what you'd like it to do. Right now it is totally inflexible. - Liam

      Delete