A colleague recently asked me the following question titled:
“Simulating trees with different pybus & Harvey's γ.”
“This may be a question with a very obvious answer, but is there a way of simulating Yule trees so they are either early burst or late burst (i.e. can you create simulated trees with a high or low γ)? Trying to satisfy a reviewer’s request to test our taxonomy paper against null models.”
Well, we can't precisely simulate constant-rate pure-birth trees that have a particular γ value; however, one thing we could do is transform the edge lengths of the tree so that has a particular, predictable value of γ. In fact, it turns out to be the case - at least so far as I can tell (though I don't have a formal proof of this) that for any tree & particular value of γ, there is a corresponding “EB” (early-burst or ACDC, see Blomberg et al. 2003) edge-length transformation that will produce that very-same γ. It's just a matter of finding it!
Turns out - that's pretty easy. We can use numerical optimization, and since we only need to optimize in one-dimension, as the EB transformation only has one paramter, r, optimization is both fast & robust.
For instance:
library(phytools)
## simulate a tree
tree<-pbtree(n=100)
ltt(tree,show.tree=TRUE,log.lineages=F,log="y")
## Object of class "ltt" containing:
##
## (1) A phylogenetic tree with 100 tips and 99 internal nodes.
##
## (2) Vectors containing the number of lineages (ltt) and branching times (times) on the tree.
##
## (3) A value for Pybus & Harvey's "gamma" statistic of 2.0956, p-value = 0.0361.
## transform to gamma=0
ebTree<-phytools:::ebTree
gamma<-function(r,tree,g)
(g-ltt(ebTree(tree,r),plot=FALSE)$gamma)^2
fit<-optimize(gamma,c(-10,10),tree=tree,g=0,tol=1e-12)
ltt(ebTree(tree,fit$minimum),show.tree=TRUE,log.lineages=F,log="y",
lwd=2)
## Object of class "ltt" containing:
##
## (1) A phylogenetic tree with 100 tips and 99 internal nodes.
##
## (2) Vectors containing the number of lineages (ltt) and branching times (times) on the tree.
##
## (3) A value for Pybus & Harvey's "gamma" statistic of 0, p-value = 1.
## or how about gamma=10
fit<-optimize(gamma,c(-10,10),tree=tree,g=10,tol=1e-12)
ltt(ebTree(tree,fit$minimum),show.tree=TRUE,log.lineages=F,log="y",
lwd=2)
## Object of class "ltt" containing:
##
## (1) A phylogenetic tree with 100 tips and 99 internal nodes.
##
## (2) Vectors containing the number of lineages (ltt) and branching times (times) on the tree.
##
## (3) A value for Pybus & Harvey's "gamma" statistic of 10, p-value = 0.
## or gamma=-10
fit<-optimize(gamma,c(-10,10),tree=tree,g=-10,tol=1e-12)
ltt(ebTree(tree,fit$minimum),show.tree=TRUE,log.lineages=F,log="y",
lwd=2)
## Object of class "ltt" containing:
##
## (1) A phylogenetic tree with 100 tips and 99 internal nodes.
##
## (2) Vectors containing the number of lineages (ltt) and branching times (times) on the tree.
##
## (3) A value for Pybus & Harvey's "gamma" statistic of -10, p-value = 0.
Note that the ebTree
transformation, though it does
produce the correct γ, has the side-effect of changing the total
depth of the tree. This needn't be the case. For instance, we can just
do the following:
## custom EB function to hold total length constant
EB<-function(tree,r){
d<-max(nodeHeights(tree))
tree<-ebTree(tree,r)
tree$edge.length<-tree$edge.length/max(nodeHeights(tree))*d
tree
}
gamma<-function(r,tree,g)
(g-ltt(EB(tree,r),plot=FALSE)$gamma)^2
## try with gamma=-4
fit<-optimize(gamma,c(-10,10),tree=tree,g=-4,tol=1e-12)
ltt(EB(tree,fit$minimum),show.tree=TRUE,log.lineages=F,log="y",
lwd=2)
## Object of class "ltt" containing:
##
## (1) A phylogenetic tree with 100 tips and 99 internal nodes.
##
## (2) Vectors containing the number of lineages (ltt) and branching times (times) on the tree.
##
## (3) A value for Pybus & Harvey's "gamma" statistic of -4, p-value = 1e-04.
OK, now let's generate 100 trees with total length of 1.0 and γ=-1:
trees<-pbtree(n=100,scale=1,nsim=100)
foo<-function(tree,g)
optimize(gamma,c(-10,10),tree=tree,g=g,tol=1e-12)$minimum
r<-sapply(trees,foo,g=-1)
ttrees<-mapply(EB,tree=trees,r=r,SIMPLIFY=FALSE)
class(ttrees)<-"multiPhylo"
g<-sapply(ttrees,function(x) ltt(x,plot=FALSE)$gamma)
g
## [1] -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1
## [24] -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1
## [47] -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1
## [70] -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1
## [93] -1 -1 -1 -1 -1 -1 -1 -1
ltt(ttrees,col=make.transparent("blue",0.25))
## 100 objects of class "ltt" in a list
What's neat is that for our single fixed value of γ = -1.0, the LTT curves vary quite a lot!
[Note. In the original post I called ltt(trees,...)
rather
than ltt(ttrees,...)
- that is, ltt
on the
original, rather than transformed trees. Shown above this is fixed, but
the LTT plots still vary quite a bit.]
That's it. Hopefully of some help.
No comments:
Post a Comment
Note: due to the very large amount of spam, all comments are now automatically submitted for moderation.