Today on R-sig-phylo a reader asked:
I have a question conserning the pgls regression in package caper.
The function allows to estimate or fix three branch length transformations.
I wanna figure out which transformation gives me the best model fit by
comparing for example a model with lambda estimated (lambda=“ML”) to a model
where I fix lambda at 0.5 (lambda=0.5). However, if I use
anova(model1, model2)
to compare the two models I get the error
message that the two models are run with different branch length
transformations, which makes sense. But is there any other possbility to
compare models with different branch length transformations? How do I know
which model is better?
This is what I responded:
We can compare two fitted models, one in which lambda is estimated and another in which lambda is fixed (at 0, 1, or some other arbitrary value) using a likelihood ratio test. This is what that would look like:
[First, let's simulate some data:
library(phytools)
tree<-pbtree(n=50)
x<-fastBM(tree)
y<-0.75*x+fastBM(phytools:::lambdaTree(tree,0.5),sig2=0.2)
DATA<-data.frame(x=x,y=y)
Done.]
## fit models
## DATA is data frame containing x & y
## tree is object of class "phylo"
library(nlme)
fitBM<-gls(y~x,data=DATA,correlation=corBrownian(1,tree),
method="ML")
fitBM
## Generalized least squares fit by maximum likelihood
## Model: y ~ x
## Data: DATA
## Log-likelihood: -76.21
##
## Coefficients:
## (Intercept) x
## 0.1209 0.7842
##
## Correlation Structure: corBrownian
## Formula: ~1
## Parameter estimate(s):
## numeric(0)
## Degrees of freedom: 50 total; 48 residual
## Residual standard error: 2.159
fitLambda<-gls(y~x,data=DATA,correlation=corPagel(1,tree,fixed=FALSE),
method="ML")
fitLambda
## Generalized least squares fit by maximum likelihood
## Model: y ~ x
## Data: DATA
## Log-likelihood: -52.62
##
## Coefficients:
## (Intercept) x
## 0.06996 0.80184
##
## Correlation Structure: corPagel
## Formula: ~1
## Parameter estimate(s):
## lambda
## 0.3873
## Degrees of freedom: 50 total; 48 residual
## Residual standard error: 0.7515
## function for likelihood ratio test
lrtest<-function(model1,model2){
lik1<-logLik(model1)
lik2<-logLik(model2)
LR<--2*(lik1-lik2)
degf<-attr(lik2,"df")-attr(lik1,"df")
P<-pchisq(LR,df=degf,lower.tail=FALSE)
cat(paste("Likelihood ratio = ",
signif(LR,5),"(df=",degf,") P = ",
signif(P,4),"\n",sep=""))
invisible(list(likelihood.ratio=LR,p=P))
}
## run likelihood-ratio test
lrtest(fitBM,fitLambda)
## Likelihood ratio = 47.176(df=1) P = 6.489e-12
I also noted that for small trees we may want to substitute a distribution obtained via simulation for the parametric χ2 distribtution; however I neglected to comment that only nested models can be compared in this way.
That's it.
No comments:
Post a Comment
Note: due to the very large amount of spam, all comments are now automatically submitted for moderation.