Saturday, November 30, 2013

Difference between gammaStat & ltt when the tree is slightly non-ultrametric

A phytools user reported a small, but measurable, difference in Pybus & Harvey's γ computed by gammaStat in ape and ltt in the phytools package. Since I know that ltt and gammaStat return the same value of γ for trees that are genuinely ultrametric, my guess was that the tree might not be precisely ultrametric. In fact, this turns out to be the case. Even though the tree passed the check is.ultrametric, trees can only be ultrametric to the degree of numerical precision that can be represented in the computer, and trees that are read in from file will generally be still considerably more non-ultrametric than that (due to rounding in the input file).

The user was kind enough to send me her tree so I was able to verify that this was indeed the case. The tree passes the check of is.ultrametric, but as we decrease tol, it eventually fails. Furthermore, it turns out to be the case that γ reported by gammaStat for a slightly non-ultrametric tree is exactly the same as a precisely ultrametric tree where all the tips have the same height as the highest tip; and γ reported by ltt for a slightly non-ultrametric tree is (nearly) exactly the same as for a precisely ultrametric tree where all the tips have the same height as the lowest tip. Here's how I found out (using the problematic tree):

> tree<-read.nexus("Trees.nex")
> gammaStat(tree)
[1] 2.320419
> ltt(tree,plot=FALSE)$gamma
[1] 2.321524
>
> ## compute the heights of all tips
> H<-nodeHeights(tree)
> h<-sapply(1:length(tree$tip.label),function(ii,h,e) h[which(e==ii)],h=H[,2],e=tree$edge[,2])
>
> ## the difference from the max height
> d<-max(h)-h
> ## add each difference to the tip
> for(i in 1:length(d))
+ tree$edge.length[which(tree$edge[,2]==i)] <- tree$edge.length[which(tree$edge[,2]==i)]+d[i]
>
> ## now they're the same & equal to gammaStat
> gammaStat(tree)
[1] 2.320419
> ltt(tree,plot=FALSE)$gamma
[1] 2.320419
>
> ## start again
> tree<-read.nexus("Trees.nex")
> gammaStat(tree)
[1] 2.320419
> ltt(tree,plot=FALSE)$gamma
[1] 2.321524
>
> ## the difference from the min
> d<-min(h)-h
> ## add each difference to the tip
> for(i in 1:length(d))
+ tree$edge.length[which(tree$edge[,2]==i)] <- tree$edge.length[which(tree$edge[,2]==i)]+d[i]
>
> ## now they're the same & equal to ltt
> gammaStat(tree)
[1] 2.321529
> ltt(tree,plot=FALSE)$gamma
[1] 2.321529

I just posted a new version of ltt that first checks that the tree is ultrametric & then if it passes is.ultrametric then it corrects the tree to be precisely ultrametric so that the computed γ lines up with gammaStat.

No comments:

Post a Comment

Note: due to the very large amount of spam, all comments are now automatically submitted for moderation.