Because it relates to something else I have been working on, today I did a quick comparison of the (at least four) different methods available in R for fitting Pagel's λ model to a single trait. I did this for just a single 200-taxon tree and compared the estimates (which are all the same, to reasonable numerical precision) and the computation times. A more thorough comparison would require running this code over many simulations to assess the robustness of each function. That I have not done, but anyone wishing to loop the following analyses 100 or more times should be my guest. This was run on my VAIO AMD E-450 laptop @ 1.65 GHz (so, not very fast):
> require(phytools); require(geiger); require(caper); require(nlme)
> tree<-pbtree(n=200,scale=1) # simulate tree
> x<-fastBM(lambdaTree(tree,0.7)) # simulate data
> # fit using phytools::phylosig
> system.time(m1<-phylosig(tree,x,method="lambda"))
user system elapsed
2.79 0.00 2.79
> # fit using geiger::fitContinuous
> system.time(m2<-fitContinuous(tree,x,model="lambda"))
Fitting lambda model:
user system elapsed
137.11 0.42 138.90
> # fit using nlme::gls
> d<-data.frame(x)
> system.time(m3<-gls(x~1,data=d,correlation=corPagel(0.5,tree,fixed=FALSE),method="ML"))
user system elapsed
51.56 0.22 53.86
> # fit using caper::pgls
> d<-data.frame(x,names(x)); colnames(d)[2]<-"Species"
> system.time(m4<-pgls(x~1,comparative.data(tree,d,"Species"), lambda="ML"))
user system elapsed
38.13 0.01 38.25
We can then get a summary as follows:
> result<-matrix(c(m1$lambda,m1$logL,
m2$Trait1$lambda,m2$Trait1$lnl,
m3$modelStruct$corStruct[1],m3$logLik,
m4$param["lambda"],m4$model$log.lik),4,2,
dimnames=list(c("phylosig","fitContinuous","gls","pgls"),
c("lambda","logL")),byrow=T)
> result
lambda logL
phylosig 0.7478915 -230.7348
fitContinuous 0.7478919 -230.7348
gls 0.7478922 -230.7348
pgls 0.7478922 -230.7348
So, at least for this example, phylosig is quite a bit faster, with no discernible difference in the results. Why this should be the case is not at all clear to me (phylosig uses the built-in optimizer, optimize, and performs full ML, not REML), but I hope that the speed does not come at the cost of robust estimation. Certainly this could be assessed by repeating the above analysis over many simulations.
One possible explanation for the faster run time of phylosig, which occurred to me after writing this post, is that I use the analytically derived solutions for σ^2 and the root state, conditioned on λ, so that ML optimization becomes a univariate rather than a multivariate maximization problem. This yields the same parameter estimates and likelihood, but should be (theoretically and, evidently, in practice) much quicker.
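To make this a little more concrete, here is a minimal sketch of the idea (this is not the actual code of phylosig, just an illustration using a made-up function profile.logL): conditioned on λ, the ML root state and σ^2 have closed-form solutions, so all that remains is univariate optimization of the profile log-likelihood of λ with optimize. For simplicity the sketch searches λ on [0,1] only.
# minimal sketch (not the phylosig internals): profile log-likelihood of lambda,
# plugging in the analytical conditional MLEs of the root state and sigma^2,
# so that optimization is univariate
library(ape) # for vcv.phylo
profile.logL<-function(lambda,tree,x){
  C<-vcv.phylo(tree)[names(x),names(x)]
  Cl<-lambda*C; diag(Cl)<-diag(C) # Pagel's lambda transform of the VCV
  invC<-solve(Cl); n<-length(x); one<-rep(1,n)
  a<-as.numeric((one%*%invC%*%x)/(one%*%invC%*%one)) # conditional ML root state
  r<-x-a
  sig2<-as.numeric(r%*%invC%*%r)/n # conditional ML sigma^2
  as.numeric(-n/2*log(2*pi*sig2)-determinant(Cl,logarithm=TRUE)$modulus/2-n/2)
}
# univariate maximization with the built-in optimizer
fit<-optimize(profile.logL,interval=c(0,1),tree=tree,x=x,maximum=TRUE)
fit$maximum   # ML estimate of lambda
fit$objective # maximized log-likelihood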
Hi,
Your code is impressively fast! But I think the calculation can be done even faster: using Felsenstein's (1973) algorithm, one can avoid creating the variance-covariance matrix or having to invert it. Using that algorithm (here called 'fast') I get the following:
lambda logL Time (s)
phylosig 0.7114102 -220.1066 0.693
fitContinuous 0.7114085 -220.1066 39.770
gls 0.7114125 -220.1066 14.666
pgls 0.7114119 -220.1066 23.730
fast 0.7114101 -220.1066 0.141
The differences are exacerbated with larger trees, e.g., here with 1000 tips:
lambda logL Time (s)
phylosig 0.7463478 -1109.464 32.126
fitContinuous 0.7463439 -1109.464 1146.695
gls 0.7463487 -1109.464 380.447
pgls 0.7463473 -1109.464 641.631
fast 0.7463478 -1109.464 0.801
I wonder if the other methods are performing unnecessary computation in forming the variance-covariance matrix?
Rob
This is fantastic, Rob.
Yes, as I alluded to above, I was initially perplexed as to why phylosig was faster (since I made no attempt to be computationally economical), but then I realized it is because it performs only univariate optimization, using the conditional MLEs of σ^2 and the root state.
An obvious alternative is to do REML estimation of λ based on contrasts; however, Rob's suggestion, to use the algorithm of Felsenstein (1973), is even better. This could certainly be used to speed up likelihood calculation for other models as well.
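To illustrate the contrasts idea, here is a minimal sketch of REML estimation of λ from independent contrasts (reml.logL is just a function I define here for illustration, using pic from ape and lambdaTree to rescale the tree, as in the simulation above): on the λ-rescaled tree the contrasts are independent with known relative variances, so the restricted likelihood requires no VCV matrix at all. The resulting estimate of λ will be close to, but not exactly the same as, the ML values above.
# minimal sketch of REML estimation of lambda from contrasts (not Rob's code)
library(ape)    # for pic
library(geiger) # for lambdaTree, as used in the simulation above
reml.logL<-function(lambda,tree,x){
  pp<-pic(x,lambdaTree(tree,lambda),var.contrasts=TRUE)
  u<-pp[,1]*sqrt(pp[,2]) # raw (unstandardized) contrasts
  V<-pp[,2]              # their variances, in units of sigma^2
  sig2<-mean(pp[,1]^2)   # conditional REML estimate of sigma^2
  sum(dnorm(u,mean=0,sd=sqrt(sig2*V),log=TRUE))
}
remlfit<-optimize(reml.logL,interval=c(0,1),tree=tree,x=x,maximum=TRUE)
remlfit$maximum # REML estimate of lambda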
For those not familiar with Felsenstein's (1973) article, I recently discovered that it is available online, evidently for free, here. The algorithm Rob describes begins on p. 476, I believe.
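Finally, here is a rough sketch of the kind of pruning calculation that Rob describes (my code, not his; lambda.rescale and pruning.logL are helpers I define here purely for illustration): the λ transform is applied directly to the branch lengths, the log-likelihood is computed in a single postorder sweep over the edges, so no VCV matrix is ever formed or inverted, and λ is again found by univariate optimization.
# rough sketch of a Felsenstein (1973)-style pruning calculation (not Rob's code)
library(ape) # for reorder & node.depth.edgelength
lambda.rescale<-function(tree,lambda){
  # multiply all branch lengths by lambda, then stretch the terminal branches
  # so that every tip keeps its original height above the root
  n<-length(tree$tip.label)
  h<-node.depth.edgelength(tree) # root-to-node distances
  tt<-tree; tt$edge.length<-tree$edge.length*lambda
  tip.edges<-which(tree$edge[,2]<=n)
  tt$edge.length[tip.edges]<-lambda*tree$edge.length[tip.edges]+
    (1-lambda)*h[tree$edge[tip.edges,2]]
  tt
}
pruning.logL<-function(tree,x){
  tree<-reorder(tree,"postorder") # descendants' edges come before their ancestors'
  n<-length(tree$tip.label); N<-n+tree$Nnode
  xx<-numeric(N); xx[1:n]<-x[tree$tip.label] # current value at each node
  vv<-numeric(N)   # variance accumulated above each node by pruning
  seen<-logical(N) # has this node received its first descendant yet?
  sumU2<-0; sumLogV<-0
  for(i in 1:nrow(tree$edge)){
    anc<-tree$edge[i,1]; des<-tree$edge[i,2]
    v<-tree$edge.length[i]+vv[des] # total variance below this edge
    if(!seen[anc]){ # first descendant to reach this node: carry it up
      xx[anc]<-xx[des]; vv[anc]<-v; seen[anc]<-TRUE
    } else {        # later descendants: form a contrast
      V<-vv[anc]+v; u<-xx[anc]-xx[des]
      sumU2<-sumU2+u^2/V; sumLogV<-sumLogV+log(V)
      xx[anc]<-(v*xx[anc]+vv[anc]*xx[des])/V # weighted average at the ancestor
      vv[anc]<-vv[anc]*v/V
    }
  }
  root<-n+1 # ape numbers the root n+1
  sig2<-sumU2/n # conditional ML estimate of sigma^2
  # log|C| = sum(log(V_k)) + log(vv[root]); the quadratic form equals sumU2
  -n/2*log(2*pi*sig2)-(sumLogV+log(vv[root]))/2-n/2
}
fastfit<-optimize(function(l) pruning.logL(lambda.rescale(tree,l),x),
  interval=c(0,1),maximum=TRUE)
fastfit$maximum; fastfit$objective # MLE of lambda & its log-likelihood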