A good long time ago now some co-authors & I (Lindenfors et al. 2010) published a 'phylogenetic paired t-test.' This is a really simple method in which we ask if a set of paired traits (say, male and female RBC counts; or summer & winter body mass) differ significantly one from the other across species, while taking into account non-independence due to the phylogeny. The method can also account for (known) uncertainty in either or both trait vectors, and can relax the assumption of pure Brownian motion for the correlation structure of the errors (using Pagel's λ model).

The method has been implemented in the function `phyl.pairedttest`

;
however a *phytools* user recently pointed out some issues with
optimizing the model (which uses likelihood). This turned out to be due to
trait scale. For instance, for (numerically) large mean differences between
the two trait vectors, the corresponding element of the Hessian matrix from
the likelihood surface was sometimes numerically indistinguishable from zero.

I fixed this by using an internal rescaling, which also then required
that I back-transform the estimated parameter by (some function of) the
scaling factor. I was also able to make other improvements and I added
an S3 `print`

method for the object class. All of my updates can
be seen
here.

The following is a simple demo using simulated data:

```
library(phytools)
packageVersion("phytools")
```

```
## [1] '0.6.46'
```

```
phylo.heatmap(tree,cbind(x1,x2,x3),fsize=c(0.5,0.9,0.9),
standardize=TRUE)
```

```
phyl.pairedttest(tree,x1,x2)
```

```
##
## Phylogenetic paired t-test:
##
## t = 0.52022, df = 57, p-value = 0.604925
##
## alternative hypothesis:
## true difference in means is not equal to 0
##
## 95 percent confidence interval on the phylogenetic
## difference in mean:
## [-0.999187, 1.72124]
##
## estimates:
## phylogenetic mean difference = 0.361026
## sig^2 of BM model = 0.879318
## lambda (fixed or estimated) = 1
##
## log-likelihood:
## -78.3204
```

```
phyl.pairedttest(tree,x1,x3)
```

```
##
## Phylogenetic paired t-test:
##
## t = -2.44073, df = 57, p-value = 0.0177879
##
## alternative hypothesis:
## true difference in means is not equal to 0
##
## 95 percent confidence interval on the phylogenetic
## difference in mean:
## [-2.95937, -0.323278]
##
## estimates:
## phylogenetic mean difference = -1.64132
## sig^2 of BM model = 0.826258
## lambda (fixed or estimated) = 0.998969
##
## log-likelihood:
## -77.2733
```

Note that the method should yield (nearly approximately) the same values
as a *standard* paired t-test, if we fix λ to zero. The only
difference is that we compute the standard error of the mean difference
from the Hessian matrix instead of analytically:

```
phyl.pairedttest(tree,x1,x2,lambda=0,fixed=TRUE)
```

```
##
## Phylogenetic paired t-test:
##
## t = 1.33211, df = 58, p-value = 0.188035
##
## alternative hypothesis:
## true difference in means is not equal to 0
##
## 95 percent confidence interval on the phylogenetic
## difference in mean:
## [-0.129139, 0.677096]
##
## estimates:
## phylogenetic mean difference = 0.273978
## sig^2 of BM model = 0.720126
## lambda (fixed or estimated) = 0
##
## log-likelihood:
## -113.078
```

```
t.test(x1,x2,paired=TRUE,var.equal=TRUE)
```

```
##
## Paired t-test
##
## data: x1 and x2
## t = 1.321, df = 59, p-value = 0.1916
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
## -0.1410439 0.6890009
## sample estimates:
## mean of the differences
## 0.2739785
```

In this example, the data were simulated with a genuine difference between
`x1`

and `x3`

, but not between `x1`

and
`x2`

as follows:

```
tree<-pbtree(n=60)
x1<-fastBM(tree)
x2<-x1+fastBM(tree)
x3<-x1+fastBM(tree,a=2)
```

That's it.

Dear Liam,

ReplyDeleteI had posted earlier about a question about the phylogenetic paired t-test, and with your help I got it sorted (I think, in the end it was also partly a problem with incompatibility of different versions of R and the package, which I could not resolve working on an office computer earlier). Now everything is running fine, but I am a bit surprised by the outcome, fearing that I make some mistake in my settings. I am looking at body masses of male primates (27 species) from captive or wild populations. When I calculate the lambda, I get a very large lambda, which is highly significantly different from 0.

> (signal_C<-phylosig(baum, captive, method="lambda", test=TRUE))

Phylogenetic signal lambda : 0.999934

logL(lambda) : -14.3279

LR(lambda=0) : 40.3972

P-value (based on LR test) : 2.07242e-10

When I then run the paired test using the "Fixed=FALSE" option, I still get a test-outcome that seems to suggest that lambda is taken as zero (yielding overall significant differences between captive and wild body masses, which in itself is not surprising).

> lambda<-signal_C$lambda

>

> phyl.pairedttest(tree=baum,wild,captive, lambda=lambda, fixed=FALSE)

Phylogenetic paired t-test:

t = -4.10251, df = 24, p-value = 0.000406537

alternative hypothesis:

true difference in means is not equal to 0

95 percent confidence interval on the phylogenetic

difference in mean:

[-0.086297, -0.0304976]

estimates:

phylogenetic mean difference = -0.0583973

sig^2 of BM model = 6.27146e-05

lambda (fixed or estimated) = 0

log-likelihood:

32.0024

In contrast, if I use the Fixed=TRUE option, and lambda is forced to be 0.99, then the difference between the means is not deemed to be statistically different.

> phyl.pairedttest(tree=baum,wild,captive, lambda=lambda, fixed=TRUE)

Phylogenetic paired t-test:

t = -0.395143, df = 25, p-value = 0.696089

alternative hypothesis:

true difference in means is not equal to 0

95 percent confidence interval on the phylogenetic

difference in mean:

[-0.302461, 0.200968]

estimates:

phylogenetic mean difference = -0.0507465

sig^2 of BM model = 0.00073346

lambda (fixed or estimated) = 0.999934

log-likelihood:

18.1789

So now I am concerned about how lambda is estimated when Fixed=FALSE (which seems intuitively the more appropriate option) and that it does not what I think it does, i.e. to use the data to estimate whether phylogeny is important to take into account.

Sorry for the length of this - but could you tell me whether I could be reasonably confident about my fixed=FALSE approach, when the original lambda estimate was significiantly different from 0?