Tuesday, August 16, 2016

"Fixing" an ultrametric tree whose edges are not ultrametric due to numerical precision / rounding

A R-sig-phylo subscriber posted the following question:

“I want to do some diversification pattern analyses with various R packages. I'm using a time-calibrated tree produced by PhyloBayes. Branch lengths are in millions of years and all taxa are extant, so the tree should be ultrametric. However, when I call is.ultrametric, it returns FALSE. Has anybody encountered something like this? Any ideas about what's going on/how to solve this?”

(Followed by some more details about check failures in different R analyses.)

My response was as follows:

“Since you are writing & reading trees to file, my guess is that it has to do with numerical precision - that is, the rounding of your edge lengths when they are written to file.

"Does your tree look ultrametric when plotted in R? If so, this is probably the case.

My recommendation is that you use phangorn to compute the non-negative least-squares edge lengths with the condition that the tree is ultrametric. This will give you the edge lengths that result in the distances between taxa with minimum sum of squared differences from the distances implied by your input tree, under the criterion that the resulting tree is ultrametric.”

Let's try this out using a simulated ultrametric tree that has had it's edges rounded - as if during writing to / reading from file:

library(phytools)
library(phangorn)
## simulate tree
tree<-otree<-pbtree(n=400,scale=100)
tree<-roundBranches(tree,1) ## round edges
## original tree ultrametric
is.ultrametric(otree)
## [1] TRUE
## rounded tree is not
is.ultrametric(tree)
## [1] FALSE
plotTree(tree,ftype="off",lwd=1) ## even though it looks ultrametric

plot of chunk unnamed-chunk-1

## compute the NNLS ultrametric tree
nnls<-nnls.tree(cophenetic(tree),tree,rooted=TRUE)
## check
is.ultrametric(nnls)
## [1] TRUE
plotTree(nnls,ftype="off",lwd=1)

plot of chunk unnamed-chunk-1

Finally, we can compare our implied distances between the trees:

tips<-tree$tip.label
plot(cophenetic(otree)[tips,tips],cophenetic(tree)[tips,tips],
    xlab="original distances",ylab="distances on rounded tree")

plot of chunk unnamed-chunk-2

cor(as.vector(cophenetic(otree)[tips,tips]),
    as.vector(cophenetic(tree)[tips,tips]))
## [1] 0.9999941
plot(cophenetic(otree)[tips,tips],cophenetic(nnls)[tips,tips],
    xlab="original distances",ylab="distances on NNLS tree")

plot of chunk unnamed-chunk-2

cor(as.vector(cophenetic(otree)[tips,tips]),
    as.vector(cophenetic(nnls)[tips,tips]))
## [1] 0.9999995

No comments:

Post a Comment