## Tuesday, October 1, 2013

### Computing the residuals for individuals from phylogenetic regression

A little while ago I received the following query:

"To account for body size I used the phylogenetic regression of your phyl.resid function. I used the mean value of each trait per species. However, now I would like to plot the residuals against the ecological variables to show how they relate, including standard deviations for trait means. Is there a way to calculate phylogenetically corrected residuals with standard deviations? Or simply calculate residuals for more than one individual per species?"

Well, there are multiple ways we could go about doing this - with multiple possible solutions. The simplest thing, and the one I'd probably recommend for starters, is to fit the phylogenetic regression to and , the species means vectors; but then compute = [1 x]β and e = y - from the original data vectors, x & y.

Here's how that would look if we had input data vectors xi and yi, both with the same order but in which names(xi) and names(yi) contain only species names (& thus no unique identifiers):

## first aggregate xi & yi to compute species means
xbar<-aggregate(xi,by=list(names(xi)),FUN=mean)
xbar<-setNames(xbar[,2],xbar[,1])
ybar<-aggregate(yi,by=list(names(yi)),FUN=mean)
ybar<-setNames(ybar[,2],ybar[,1])
## now fit PGLS model
fit<-gls(ybar~xbar,data.frame(xbar,ybar),correlation=
corBrownian(1,tree))
## compute yhat based on individual data in xi
yhat<-(cbind(1,xi)%*%fit\$coef)[,1]
## compute individualized residuals
ei<-yi-yhat

Note that the species-wise mean of ei will be exactly equal to our residuals from the fitted model using species means. E.g., compare:

ebar<-aggregate(ei,by=list(names(ei)),FUN=mean)
ebar<-setNames(ebar[,2],ebar[,1])
## with
residuals(fit)
They should be the same.

That's it!