I just
updated
the phytools function force.ultrametric
to print the following message (unless explicitly
turned off by the user by setting the optional argument message=FALSE
):
## ***************************************************************
## * Note: *
## * force.ultrametric does not include a formal method to *
## * ultrametricize a tree & should only be used to coerce to *
## * a phylogeny that fails is.ultramtric due to rounding -- *
## * not as a substitute for formal rate-smoothing methods. *
## ***************************************************************
The reason that I decided that this warning was necessary because I've seen force.ultrametric
used in some papers in lieu of a formal rate-smoothing method for generating an ultrametric tree.
This is not appropriate.
What is force.ultrametric
for then? It's essentially a coercion method that forces a tree
that should be ultrametric, but that fails is.ultrametric
(perhaps, for example, due to
numerical precision or rounding of the edge lengths when it was written to file), to be so.
Here's what I mean.
Let's load the data object anoletree
that comes loaded with phytools.
library(phytools)
data(anoletree)
anoletree<-as.phylo(anoletree)
plotTree(anoletree,ftype="i",fsize=0.7,lwd=1)
Looks pretty ultrametric, right? It should be.
Think again, though – it's not!
is.ultrametric(anoletree)
## [1] FALSE
What's wrong? We can see if we pull out the total height of all the tips of the tree as follows:
h<-setNames(sapply(1:Ntip(anoletree),
nodeheight,tree=anoletree),
anoletree$tip.label)
h
## ahli allogus rubribarbus imias sagrei bremeri quadriocellifer ophiolepis
## 6 6 6 6 6 6 6 6
## mestrei jubar homolechis confusus guafe garmani opalinus grahami
## 6 6 6 6 6 6 6 6
## valencienni lineatopus evermanni stratulus krugi pulchellus gundlachi poncensis
## 6 6 6 6 6 6 6 6
## cooki cristatellus brevirostris caudalis marron websteri distichus alumina
## 6 6 6 6 6 6 6 6
## semilineatus olssoni insolitus whitemani haetianus breslini armouri cybotes
## 6 6 6 6 6 6 6 6
## shrevei longitibialis strahmi marcanoi baleatus barahonae ricordii cuvieri
## 6 6 6 6 6 6 6 6
## altitudinalis oporinus isolepis allisoni porcatus loysiana guazuma placidus
## 6 6 6 6 6 6 6 6
## sheplani alayoni angusticeps paternus alutaceus inexpectatus clivicola cupeyalensis
## 6 6 6 6 6 6 6 6
## cyanopleurus alfaroi macilentus vanidicus baracoae noblei smallwoodi luteogularis
## 6 6 6 6 6 6 6 6
## equestris bahorucoensis dolichocephalus hendersoni darlingtoni aliniger singularis chlorocyanus
## 6 6 6 6 6 6 6 6
## coelestinus occultus
## 6 6
Ok…. All the tree heights seem the same. But wait a sec….
options(digits=10)
h
## ahli allogus rubribarbus imias sagrei bremeri quadriocellifer ophiolepis
## 5.99999994 5.99999994 5.99999994 5.99999994 5.99999994 5.99999994 5.99999994 5.99999994
## mestrei jubar homolechis confusus guafe garmani opalinus grahami
## 5.99999994 5.99999994 5.99999994 5.99999994 5.99999994 5.99999994 5.99999994 5.99999994
## valencienni lineatopus evermanni stratulus krugi pulchellus gundlachi poncensis
## 5.99999994 5.99999994 5.99999994 5.99999994 6.00000000 6.00000000 5.99999994 5.99999994
## cooki cristatellus brevirostris caudalis marron websteri distichus alumina
## 5.99999994 5.99999994 5.99999994 5.99999988 5.99999988 5.99999994 5.99999994 5.99999994
## semilineatus olssoni insolitus whitemani haetianus breslini armouri cybotes
## 5.99999994 5.99999994 5.99999994 6.00000000 5.99999994 5.99999994 5.99999994 5.99999994
## shrevei longitibialis strahmi marcanoi baleatus barahonae ricordii cuvieri
## 5.99999994 5.99999994 5.99999994 5.99999994 5.99999994 5.99999994 5.99999994 5.99999994
## altitudinalis oporinus isolepis allisoni porcatus loysiana guazuma placidus
## 5.99999994 5.99999994 5.99999994 5.99999994 5.99999994 5.99999994 5.99999994 6.00000000
## sheplani alayoni angusticeps paternus alutaceus inexpectatus clivicola cupeyalensis
## 6.00000000 6.00000000 6.00000000 6.00000000 5.99999994 5.99999994 5.99999994 5.99999994
## cyanopleurus alfaroi macilentus vanidicus baracoae noblei smallwoodi luteogularis
## 5.99999994 6.00000000 6.00000000 5.99999994 6.00000000 6.00000000 6.00000000 6.00000000
## equestris bahorucoensis dolichocephalus hendersoni darlingtoni aliniger singularis chlorocyanus
## 5.99999994 6.00000000 5.99999994 5.99999994 5.99999994 5.99999994 5.99999994 6.00000000
## coelestinus occultus
## 6.00000000 5.99999994
Now we can see that some of the tips have total height of 6.00000000
while others have
a total height of 5.99999994
. This is probably because one of the internal edges of the
tree was rounded downward by about 0.00000006 at some point – perhaps when the phylogeny
was written to or read from file.
We can fix this using force.ultrametric
as follows:
new.tree<-force.ultrametric(anoletree)
## ***************************************************************
## * Note: *
## * force.ultrametric does not include a formal method to *
## * ultrametricize a tree & should only be used to coerce to *
## * a phylogeny that fails is.ultramtric due to rounding -- *
## * not as a substitute for formal rate-smoothing methods. *
## ***************************************************************
Our new tree has different total height as follows:
h<-setNames(sapply(1:Ntip(new.tree),
nodeheight,tree=new.tree),
new.tree$tip.label)
h
## ahli allogus rubribarbus imias sagrei bremeri quadriocellifer ophiolepis
## 5.999999959 5.999999959 5.999999959 5.999999959 5.999999959 5.999999959 5.999999959 5.999999959
## mestrei jubar homolechis confusus guafe garmani opalinus grahami
## 5.999999959 5.999999959 5.999999959 5.999999959 5.999999959 5.999999959 5.999999959 5.999999959
## valencienni lineatopus evermanni stratulus krugi pulchellus gundlachi poncensis
## 5.999999959 5.999999959 5.999999959 5.999999959 5.999999959 5.999999959 5.999999959 5.999999959
## cooki cristatellus brevirostris caudalis marron websteri distichus alumina
## 5.999999959 5.999999959 5.999999959 5.999999959 5.999999959 5.999999959 5.999999959 5.999999959
## semilineatus olssoni insolitus whitemani haetianus breslini armouri cybotes
## 5.999999959 5.999999959 5.999999959 5.999999959 5.999999959 5.999999959 5.999999959 5.999999959
## shrevei longitibialis strahmi marcanoi baleatus barahonae ricordii cuvieri
## 5.999999959 5.999999959 5.999999959 5.999999959 5.999999959 5.999999959 5.999999959 5.999999959
## altitudinalis oporinus isolepis allisoni porcatus loysiana guazuma placidus
## 5.999999959 5.999999959 5.999999959 5.999999959 5.999999959 5.999999959 5.999999959 5.999999959
## sheplani alayoni angusticeps paternus alutaceus inexpectatus clivicola cupeyalensis
## 5.999999959 5.999999959 5.999999959 5.999999959 5.999999959 5.999999959 5.999999959 5.999999959
## cyanopleurus alfaroi macilentus vanidicus baracoae noblei smallwoodi luteogularis
## 5.999999959 5.999999959 5.999999959 5.999999959 5.999999959 5.999999959 5.999999959 5.999999959
## equestris bahorucoensis dolichocephalus hendersoni darlingtoni aliniger singularis chlorocyanus
## 5.999999959 5.999999959 5.999999959 5.999999959 5.999999959 5.999999959 5.999999959 5.999999959
## coelestinus occultus
## 5.999999959 5.999999959
It also now passes is.ultrametric
:
is.ultrametric(new.tree)
## [1] TRUE
In this case, we have a new total height of the tree that is the (weighted) average of our
two different previous heights. This is because the default method of force.ultrametric
is to use non-negative least squares to minimize the sum of squared difference between the
edge lengths of our tree and the edge lengths of an ultrametric tree with the same topology.
The implied patristic distances on these two trees should also be (virtually) identical:
D1<-cophenetic(anoletree)
D2<-cophenetic(new.tree)[rownames(D1),
colnames(D1)]
plot(D1,D2,bty="n",pch=21,cex=1.2,bg="grey",
xlab="patristic distances on original tree",
ylab="patristic distances on NNLS tree")
lines(c(0,12),c(0,12),lty="dotted")
legend("topleft","1:1 line",lty="dotted",bty="n")
If our tree comes from a method (like ML estimation using RAxML, for instance) that does
not result in an ultrametric tree, my recommendation would be to use a formal method such
as penalized likelihood (Sanderson
2002). In R, this is implemented in
the ape function chronos
.