## Wednesday, December 5, 2012

### Hypothesis testing using fitDiversityModel

A user had this question about using the function fitDiversityModel (which implements an ML version of the method in Mahler et al. (2010) to test hypotheses about phenotypic diversification:

I have one short question. In the paper by L. Mahler & al. they also tested the model with a single evolutionary rate. I’m asking me if it is “enough” to write it like that: Constant.time.results <- fitDiversityModel(tree,x,d=NULL)

The answer is "no" - but it is very easy to conduct the desired test. If fitDiversity(...,d=NULL), then the function estimates lineage density at each node as if the radiation is occurring within a single region (and there is no extinction or missing taxa - but that is a matter for another day); and then fits a model in which the rate of evolution varies as a function of lineage density. This will generally not be our null model. For that, we should specify d to be a vector of zeroes. We can then use a likelihood ratio test (for example) to test the null hypothesis that the pace of evolution has declined with increasing lineage density.

Here's a worked example, simulated so that the null is correct:
> require(phytools)
> # simulate tree & data under the null
> tree<-pbtree(n=100)
> x<-fastBM(tree)
> # fit the lineage-density model assuming a single region
> # (i.e., all contemporary ancestors were competitors)
> fitLD<-fitDiversityModel(tree,x)
no values for lineage density provided; computing assuming single biogeographic region
> fitLD
\$logL
 -126.7954
\$sig0
 1.171704
\$psi  -0.004116475
\$vcv
sig0           psi
sig0  0.121592376 -1.833989e-03
psi  -0.001833989  3.240988e-05

> # now fit the null hypothesis of constant rate
> d<-rep(0,tree\$Nnode); names(d)<-1:tree\$Nnode+length(tree\$tip)
> fitNull<-fitDiversityModel(tree,x,d)
psi not estimable because diversity is constant through time.
> fitNull
\$logL
 -127.0581
\$sig0
 0.9602251
\$vcv
sig0
sig0 0.01844064

> # now test using LR
> LR<-2*(fitLD\$logL-fitNull\$logL)
> P<-pchisq(LR,df=1,lower.tail=F)
> P
 0.4685722

The null model is just pure Brownian motion; and we would obtain the same parameter estimate and likelihood using, say, fitContinuous in the geiger package:
> fitContinuous(tree,x)
Fitting BM model:
\$Trait1
\$Trait1\$lnl
 -127.0581
\$Trait1\$beta
 0.9602253
\$Trait1\$aic
 258.1161
\$Trait1\$aicc
 258.2411
\$Trait1\$k
 2

That's it.