Saturday, November 17, 2012

Testing for Pagel's λ < 1.0

A recent commenter on the phytools blog asks:
I’m working on a comparative phylogenetic project and want to show that my data has phylogenetic signal. Using Pagel’s lambda I get lambda – 0.854. Is this value significant enough in that it’s safe to model the variable as Brownian motion?

I interpret this to mean "how can we test the alternative hypothesis that Pagel's λ is less then 1.0; against the null that λ=1.0." This is easy if we accept the contention that our likelihood ratio should be distributed as a Χ2 with 1 degree of freedom (for one parameter, λ, estimated in the more complex model), an asymptotic property of likelihoods. (There may be some very good reasons - which we'll set aside for the moment - to question this assertion, for instance see a recent article by Carl Boettiger et al.)

This can be done most naturally using the fitContinuous function in 'geiger' - but we can also do it using entirely phytools functions, and since I developed and maintain phytools (not geiger), I'll demo that option.

First, let's simulate a tree & data for the demo, using the λ=0.854 obtained by the commenter. We will use geiger::lambdaTree for this part:
> library(phytools); library(geiger)
> # simulate tree
> tree<-pbtree(n=50)
> # simulate data with lambda=0.854
> x<-fastBM(lambdaTree(tree,0.854))

OK, now let's fit both models, compute the likelihood ratio, and then compare it to a Χ2 distribution with one degree of freedom. We need to use a little bit of a work-around to fit the BM model using brownie.lite:
> # fit lambda model
> lambda<-phylosig(tree,x,method="lambda")
> lambda
[1] 0.9497926
[1] -87.54412
> # get likelihood for bm model
> tree$mapped.edge<-matrix(tree$edge.length, nrow(tree$edge),1,dimnames=list(NULL,"1"))
> bm.logL<-brownie.lite(tree,x)$logL1
> # conduct hypothesis test using chi-square
> LR<--2*(bm.logL-lambda$logL)
> P<-pchisq(LR,df=1,lower.tail=F)
> P
[1] 0.1346832

So, in this simulated case at least, we cannot distinguish our estimated phylogenetic signal from Brownian motion.

An additional word of caution, however. Sampling error in the estimation of species means can also have the effect of depressing phylogenetic signal - so if the species means are estimated with a limited number of samples per species, it might be wise to conduct this analysis including estimated within species sampling variances. These can be supplied using the argument se in both phylosig and brownie.lite.


  1. Thanks Liam, This is very helpful!

    1. Hi again,

      I just have a couple of short questions --

      By sampling error do you mean the "se" of the value of the trait(s) of each species?

      So in the case above, we would not reject the NULL hypothesis, because LR is 15? Because p value is not that small?


    2. Yes - "se" is the standard error of the mean for each species. This is normally calculated for each species as the within species SD divided by the square root of the sample size.

      In the case above, we would not reject the null - but the LR was not 15 (as you say) in this particular stochastic simulation. (The LR for a P of 0.135 would be around 2.24, or so. Since this is a simulation - individual results will vary.) The rejection level for alpha=0.05 with df=1 is about 3.84.

      - Liam

    3. Hi Liam,

      I ran the same analysis on another tree (190 species after pruning). My results are LAMBDA = 0.9109, LR = 7.06 and p = 0.0078.

      For an upper tail one sided test we would reject the null hypothesis that Lambda = 1 for alpha = 0.05 and 0.01. We would not reject the null hypothesis that lambda = 1 for alpha = 0.001.

      Is this the right interpretation?

  2. Hi Liam, to follow up on this interesting thread, what would be your suggestion to test the other end of the spectrum, that lambda is (or not) significantly different from 0? Would a comparison with a white noise model do the trick? Or using lambdaTree() in geiger to transform the phylogeny to a star tree?
    I enjoy the blog!

    1. Hi Alejandro. This is the default option in phylosig(...,method="lambda",test=TRUE). So, for tree and data in x:


      BTW, this indeed gives the same likelihood as the so-called "white noise" model of fitContinuous in the geiger package.

      - Liam