## Wednesday, July 2, 2014

### Principal components rotation of ancestral states; or ancestral state reconstruction of PC scores?

I recently fielded the following query:

From what I understand, the phylomorphospace function allows the mapping of the phylogeny on a morphospace defined by 2 characters/traits. However, I'm using multivariate data from a morphometric analysis, and the shape variation is explained by a lot of axes. The first 2 PCs can approximate the shape variation, but my question is: how do you calculate ancestral nodes using all the PC scores, or all of your fourier coefficients if using this method? Is it possible? Do you calculate the nodes on the PCs you are interested in? Or do you calculate the nodes on all your variables and then project them in the desired shape space?

So, in essence, should we perform ancestral state reconstruction on our original characters and then rotate them using the PC loadings; or should we first run phylogenetic PCA and then perform ancestral state reconstruction on the PC scores for species?

Well, the answer is - it doesn't matter. We will obtain the same ancestral state reconstructions for our PCs regardless of our order of operations. This could no doubt be proved analytically, but it is also straightforward to demonstrate via simulation as follows.

First, simulate tree & data:

> ## stochastic pure-birth tree
> tree<-pbtree(n=26,tip.label=LETTERS)
> ## random covariance matrix
> L<-matrix(rnorm(16),4)
> L[upper.tri(L)]<-0
> V<-L%*%t(L)
> ## simulated correlated character evolution on the tree
> X<-sim.corrs(tree,vcv=V)

Now, we can do our two analyses - first compute ancestral states on the original data & rotate using the phylogenetic PCA loadings; and then rotate our tip data, and reconstruct ancestral states for the PC scores:

> ## perform phylogenetic PCA
> pca<-phyl.pca(tree,X)
> ## reconstruct ancestral states for original data
> A<-apply(X,2,fastAnc,tree=tree)
> ## rotate in PC space
> ## compute phylogenetic means for calculation
> M<-matrix(rep(A[1,],nrow(A)),nrow(A),ncol(A),byrow=TRUE)
> rotatedA<-(A-M)%*%pca\$Evec
> ## now compute ancestral states for PCA scores
> S<-apply(pca\$S,2,fastAnc,tree=tree)

Now we can compare them:

> plot(rotatedA,S)

That's it.