Thursday, May 28, 2015

Phylogenetic canonical correlation analysis using contrasts

phytools has a function, phyl.cca, described in a paper by Alexis Harrison & me, that performs canonical correlation analysis in a phylogenetic context.

It is possible to do the same analysis, returning the same correlations & proportional coefficients, if we first compute Felsenstein's contrasts and then perform an uncentered canonical correlation analysis on the contrasts. Here is a quick demo of how to do this for data contained in matrices X and Y, and phylogeny tree.

library(phytools)

cca1<-cancor(apply(X,2,pic,phy=tree),apply(Y,2,pic,phy=tree),
    xcenter=FALSE,ycenter=FALSE)
cca1$cor
##  [1] 0.9496915 0.9099798 0.8602210 0.8278812 0.8035111 0.7906251 0.7601731
##  [8] 0.7348214 0.6872831 0.5698925 0.5605435 0.4835210 0.4116497 0.2441289
cca2<-phyl.cca(tree,X,Y)
cca2$cor
##  [1] 0.9496915 0.9099798 0.8602210 0.8278812 0.8035111 0.7906251 0.7601731
##  [8] 0.7348214 0.6872831 0.5698925 0.5605435 0.4835210 0.4116497 0.2441289

Just to verify that the correlations are the same and that the coefficients are proportional, let's quickly plot them:

## this is 1:1
plot(cca1$cor,cca2$cor)


## this is 1:1 or 1:-1, because the sign of the coefficients for
## any canonical axis is arbitrary
## (in addition, be aware that the scale between phylogenetic &
## non-phylogenetic analyses will probably differ) 
plot(cca1$xcoef,cca2$xcoef)

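The same check can also be made numerically rather than by eye. What follows is only a minimal sketch, not part of the original analysis: the tree and data are simulated (the use of pbtree and fastBM, the seed, and the dimensions are all my assumptions for illustration), and cosine is a hypothetical helper defined just for this example. Re() is used purely as a guard in case phyl.cca returns values stored as complex numbers with a zero imaginary part.

```r
library(phytools)

## simulated tree & data: assumptions for illustration only,
## not the data used in this post
set.seed(1)
tree<-pbtree(n=100)
X<-fastBM(tree,nsim=3)
Y<-X+fastBM(tree,nsim=3) ## make Y correlated with X
colnames(X)<-paste("x",1:3,sep="")
colnames(Y)<-paste("y",1:3,sep="")

## contrasts + uncentered CCA, and phylogenetic CCA
cca1<-cancor(apply(X,2,pic,phy=tree),apply(Y,2,pic,phy=tree),
    xcenter=FALSE,ycenter=FALSE)
cca2<-phyl.cca(tree,X,Y)

## are the canonical correlations equal (to numerical tolerance)?
same_cor<-isTRUE(all.equal(cca1$cor,Re(as.vector(cca2$cor)),
    tolerance=1e-6))

## hypothetical helper: cosine of the angle between two vectors,
## which is +1 or -1 if the vectors are proportional
cosine<-function(a,b) sum(a*b)/sqrt(sum(a^2)*sum(b^2))

## compare the X coefficients axis by axis
k<-length(cca1$cor)
cosines<-sapply(1:k,function(i)
    cosine(cca1$xcoef[,i],Re(cca2$xcoef[,i])))

same_cor
round(abs(cosines),6)
```

If the two approaches agree, same_cor should be TRUE and each absolute cosine should be 1 up to rounding. The absolute value is taken because the sign of any canonical axis is arbitrary.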

The main advantage of phyl.cca is that it also returns scores in the original space, and it automatically runs hypothesis tests on the canonical correlations. There are other canonical correlation functions in R that do this, but they do not allow the data to be treated as already centered (that is, they always center the data first), which means that they cannot be used with contrasts.
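To see why switching off centering matters, here is a small sketch, again using a simulated tree and data with an arbitrary seed (my assumptions, not the data from this post). Contrasts are analyzed through the origin, so centering them first should change the answer. The last two lines peek at the extra output of phyl.cca; the component names xscores and p are as I understand them from the phytools documentation.

```r
library(phytools)

## simulated tree & data: assumptions for illustration only
set.seed(2)
tree<-pbtree(n=50)
X<-fastBM(tree,nsim=2)
Y<-X+fastBM(tree,nsim=2)

## contrasts for each column of X and Y
pX<-apply(X,2,pic,phy=tree)
pY<-apply(Y,2,pic,phy=tree)

## CCA on contrasts with & without centering, and phylogenetic CCA
uncentered<-cancor(pX,pY,xcenter=FALSE,ycenter=FALSE)
centered<-cancor(pX,pY) ## default is xcenter=TRUE, ycenter=TRUE
phylo<-phyl.cca(tree,X,Y)

## canonical correlations side by side
cors<-cbind(uncentered=uncentered$cor,centered=centered$cor,
    phyl.cca=Re(as.vector(phylo$cor)))
cors

## scores in the original space & P-values from the hypothesis tests
head(phylo$xscores)
phylo$p
```

In this sketch the uncentered column is expected to match phyl.cca, while the centered column generally will not.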

This post is based on a user query about how to do this, by the way.

That's all there is to it.

1 comment:

  1. Dear Liam,

    I performed a phyl.cca and tried to extract the correlation between both matrices, using the 'comput' function:

    phylpca<-phyl.pca(tree,x,y)
    phyl_cor<-comput(x,y,phylpca)

    Alas, I always get an error: "— 'y' must be numeric — ".
    This is most probably caused by the 'yscores', which end with '+0i'. Is there a possibility to solve this problem?


    Thank you very much!


    All the best,


    Simon
