## Thursday, September 8, 2011

### Lambda optimization in phyl.cca()

I just added joint λ optimization to the phylogenetic canonical correlation analysis function phyl.cca(). Direct link to code is here. This nearly matches up its functionality to my "pcca" program (here & publication here). The only thing still missing is the calculation of standardized canonical coefficients and canonical coefficients. The former are analogous to standardized regression coefficients; whereas the latter are correlations (in this case, phylogenetic correlations) between the canonical variables and the original measured traits. I will hopefully add these to a future version of the function.

Just to give a quick sense of λ to the uninitiated, λ is a scaling parameter for the off-diagonals of the "coancestry matrix" (here defined as N × N matrix containing the times above the root for each pair of N species) and the among species phenotypic correlation matrix. λ and the joint estimation of λ for multiple traits is best described in a paper by Rob Freckleton & colleages (2002). It provides a very simple way to relax the strict Brownian motion assumption of most phylogenetic analyses of this nature.

To illustrate use of this function, I will give the following example of simulating with λ = 0.75 and no correlations between the Xs & the Ys. To do this, you need to install & load "phytools," "geiger," & (naturally) their dependencies.

First, let's simulate a tree & data:

> tree<-drop.tip(birthdeath.tree(b=1,d=0,taxa.stop=101),"101")
> X<-fastBM(lambdaTree(tree,lambda=0.75),nsim=4)
> Y<-fastBM(lambdaTree(tree,lambda=0.75),nsim=3)

Here, we've simulated a random Yule tree with 100 taxa along with two sets of uncorrelated random variables.

Now, let's run phylogenetic CCA with joint estimation of λ:

> result<-phyl.cca(tree,X,Y,fixed=F)

Here's our estimate of λ:

> result\$lambda
[1] 0.7524329

Here's our canonical correlations & corresponding p-values:

> result\$cor
[1] 0.37263858 0.17331011 0.04891746
> result\$p
[1] 0.1377030 0.7930295 0.8924366

Now, let's try the analysis with fixed lambda of λ=1.0 (the default):

> result<-phyl.cca(tree,X,Y)
> result\$cor
[1] 0.5571312 0.2352854 0.1519402
> result\$p
[1] 2.316245e-05 2.665493e-01 3.297459e-01

And, now without the phylogeny at all (i.e., with a star tree):

> result<-phyl.cca(starTree(tree\$tip.label,rep(1,100)),X,Y)
> result\$cor
[1] 0.5505426 0.2995164 0.1657136
> result\$p
[1] 7.276668e-06 7.216184e-02 2.664303e-01

Although this is just a single example, it gives us a little cause for consternation because both the fixed λ and non-phylogenetic analyses show significant canonical correlations, even though we didn't simulate any!

Please let me know if you give the function a try.

1. Dear Liam,
I am trying your phyl.cca function but I keep getting the following error:
Error in ccs > 0 : invalid comparison with complex values
is this a format problem with the data matrices or something else is going on?
cheers
Arley

2. Hello Liam, just tried it with real data and got:

nonfixed lambda:
> pCCA\$cor
[1] 0.8592734 0.7229225 0.4918054
> pCCA\$p
[1] 0.009319754 0.277941825 0.732731644

fixed lambda (default)
> pCCA\$cor
[1] 0.8487658 0.7821672 0.4815591
> pCCA\$p
[1] 0.004401122 0.112495274 0.762967808
star phylogeny:
[1] 0.8794160 0.7020003 0.4563339

> pCCA\$p
[1] 0.007407126 0.412586504 0.828730628

Does it give you any light?
My challenge now is to plot these results in a meaningful way.

Cheers
Agus

1. Hi Agus.

Looks like you have close agreement between phylogenetic & non-phylogenetic analyses which is that you have one significant canonical correlation. I don't have any specific advice regarding plotting, but often people interpret CCA results using things such as the canonical loadings.

Good luck. Liam

Note: due to the very large amount of spam, all comments are now automatically submitted for moderation.