Phylogenetic Tools for Comparative Biology: Significant update to phyl.pairedttest for phylogenetic paired t-test in R

Tuesday, November 28, 2017

Significant update to `phyl.pairedttest` for phylogenetic paired t-test in R

A good long time ago now some co-authors & I (Lindenfors et al. 2010) published a 'phylogenetic paired t-test.' This is a really simple method in which we ask if a set of paired traits (say, male and female RBC counts; or summer & winter body mass) differ significantly one from the other across species, while taking into account non-independence due to the phylogeny. The method can also account for (known) uncertainty in either or both trait vectors, and can relax the assumption of pure Brownian motion for the correlation structure of the errors (using Pagel's λ model).

The method has been implemented in the function phyl.pairedttest; however a phytools user recently pointed out some issues with optimizing the model (which uses likelihood). This turned out to be due to trait scale. For instance, for (numerically) large mean differences between the two trait vectors, the corresponding element of the Hessian matrix from the likelihood surface was sometimes numerically indistinguishable from zero.

I fixed this by using an internal rescaling, which also then required that I back-transform the estimated parameter by (some function of) the scaling factor. I was also able to make other improvements and I added an S3 print method for the object class. All of my updates can be seen here.

The following is a simple demo using simulated data:

library(phytools)
packageVersion("phytools")

## [1] '0.6.46'

phylo.heatmap(tree,cbind(x1,x2,x3),fsize=c(0.5,0.9,0.9),
    standardize=TRUE)

plot of chunk unnamed-chunk-1

phyl.pairedttest(tree,x1,x2)

## 
## Phylogenetic paired t-test:
## 
##    t = 0.52022, df = 57, p-value = 0.604925
## 
## alternative hypothesis:
##    true difference in means is not equal to 0
## 
## 95 percent confidence interval on the phylogenetic
## difference in mean:
##    [-0.999187, 1.72124]
## 
## estimates:
##    phylogenetic mean difference = 0.361026 
##    sig^2 of BM model = 0.879318 
##    lambda (fixed or estimated) = 1 
## 
## log-likelihood:
##     -78.3204

phyl.pairedttest(tree,x1,x3)

## 
## Phylogenetic paired t-test:
## 
##    t = -2.44073, df = 57, p-value = 0.0177879
## 
## alternative hypothesis:
##    true difference in means is not equal to 0
## 
## 95 percent confidence interval on the phylogenetic
## difference in mean:
##    [-2.95937, -0.323278]
## 
## estimates:
##    phylogenetic mean difference = -1.64132 
##    sig^2 of BM model = 0.826258 
##    lambda (fixed or estimated) = 0.998969 
## 
## log-likelihood:
##     -77.2733

Note that the method should yield (nearly approximately) the same values as a standard paired t-test, if we fix λ to zero. The only difference is that we compute the standard error of the mean difference from the Hessian matrix instead of analytically:

phyl.pairedttest(tree,x1,x2,lambda=0,fixed=TRUE)

## 
## Phylogenetic paired t-test:
## 
##    t = 1.33211, df = 58, p-value = 0.188035
## 
## alternative hypothesis:
##    true difference in means is not equal to 0
## 
## 95 percent confidence interval on the phylogenetic
## difference in mean:
##    [-0.129139, 0.677096]
## 
## estimates:
##    phylogenetic mean difference = 0.273978 
##    sig^2 of BM model = 0.720126 
##    lambda (fixed or estimated) = 0 
## 
## log-likelihood:
##     -113.078

t.test(x1,x2,paired=TRUE,var.equal=TRUE)

## 
##  Paired t-test
## 
## data:  x1 and x2
## t = 1.321, df = 59, p-value = 0.1916
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  -0.1410439  0.6890009
## sample estimates:
## mean of the differences 
##               0.2739785

In this example, the data were simulated with a genuine difference between x1 and x3, but not between x1 and x2 as follows:

tree<-pbtree(n=60)
x1<-fastBM(tree)
x2<-x1+fastBM(tree)
x3<-x1+fastBM(tree,a=2)

That's it.

1 comment:

MarenNovember 17, 2022 at 2:55 AM
Dear Liam,
I had posted earlier about a question about the phylogenetic paired t-test, and with your help I got it sorted (I think, in the end it was also partly a problem with incompatibility of different versions of R and the package, which I could not resolve working on an office computer earlier). Now everything is running fine, but I am a bit surprised by the outcome, fearing that I make some mistake in my settings. I am looking at body masses of male primates (27 species) from captive or wild populations. When I calculate the lambda, I get a very large lambda, which is highly significantly different from 0.
> (signal_C<-phylosig(baum, captive, method="lambda", test=TRUE))

Phylogenetic signal lambda : 0.999934
logL(lambda) : -14.3279
LR(lambda=0) : 40.3972
P-value (based on LR test) : 2.07242e-10

When I then run the paired test using the "Fixed=FALSE" option, I still get a test-outcome that seems to suggest that lambda is taken as zero (yielding overall significant differences between captive and wild body masses, which in itself is not surprising).
> lambda<-signal_C$lambda
>
> phyl.pairedttest(tree=baum,wild,captive, lambda=lambda, fixed=FALSE)

Phylogenetic paired t-test:

t = -4.10251, df = 24, p-value = 0.000406537

alternative hypothesis:
true difference in means is not equal to 0

95 percent confidence interval on the phylogenetic
difference in mean:
[-0.086297, -0.0304976]

estimates:
phylogenetic mean difference = -0.0583973
sig^2 of BM model = 6.27146e-05
lambda (fixed or estimated) = 0

log-likelihood:
32.0024

In contrast, if I use the Fixed=TRUE option, and lambda is forced to be 0.99, then the difference between the means is not deemed to be statistically different.
> phyl.pairedttest(tree=baum,wild,captive, lambda=lambda, fixed=TRUE)

Phylogenetic paired t-test:

t = -0.395143, df = 25, p-value = 0.696089

alternative hypothesis:
true difference in means is not equal to 0

95 percent confidence interval on the phylogenetic
difference in mean:
[-0.302461, 0.200968]

estimates:
phylogenetic mean difference = -0.0507465
sig^2 of BM model = 0.00073346
lambda (fixed or estimated) = 0.999934

log-likelihood:
18.1789

So now I am concerned about how lambda is estimated when Fixed=FALSE (which seems intuitively the more appropriate option) and that it does not what I think it does, i.e. to use the data to estimate whether phylogeny is important to take into account.
Sorry for the length of this - but could you tell me whether I could be reasonably confident about my fixed=FALSE approach, when the original lambda estimate was significiantly different from 0?
ReplyDelete
Replies

Add comment

Note: due to the very large amount of spam, all comments are now automatically submitted for moderation.

Pages

Tuesday, November 28, 2017

Significant update to phyl.pairedttest for phylogenetic paired t-test in R

1 comment:

Significant update to `phyl.pairedttest` for phylogenetic paired t-test in R