Note that equation (6) of Rohlf (2001) only gives the relative variance on the ancestral state estimate at the root node. To scale that estimate to our data, we need to multiply by the phylogenetic variance for our continuous trait. This can be computed as the mean square of the contrasts. Once we have the variances, we can compute our 95% CIs on the estimates as the estimates +/- 1.96 × the square root of the variances.
I didn't realize this when I was writing the function, but it turns out to be the case that this update to fastAnc depends on ape >= 3.0-7 (i.e., the newest version of ape as of the date of writing). This is because the options in the ape function for independent contrasts, pic, were expanded in the latest release to include the option of returning the tree with branches scaled to expected variance - which we can conveniently exploit to do the calculation of equation (6) in Rohlf.
I should also point out that the 95% CIs obtained by this function differ in a substantial way from the 95% CIs computed in ace. Specifically, the 95% CIs computed in ace would seem to be too small. We can show this relatively easily by simulation, as follows:
> onCI.ace<-onCI.fastAnc<-vector()
> N<-100
> for(i in 1:1000){
+ tree<-pbtree(n=N)
+ x<-fastBM(tree,internal=TRUE)
+ a<-fastAnc(tree,x[1:N],vars=TRUE,CI=TRUE)
+ onCI.fastAnc[i]<-sum((x[1:tree$Nnode+N]>a$CI95[,1])*(x[1:tree$Nnode+N]< a$CI95[,2]))/tree$Nnode
+ b<-ace(x[1:N],tree,CI=TRUE)
+ onCI.ace[i]<-sum((x[1:tree$Nnode+N]>b$CI95[,1])*(x[1:tree$Nnode+N]< b$CI95[,2]))/tree$Nnode
+ }
There were 24 warnings (use warnings() to see them)
> warnings()
Warning messages:
1: In sqrt(diag(solve(h))) : NaNs produced
2: In sqrt(diag(solve(h))) : NaNs produced
3: ...
> # this should be 0.95
> mean(onCI.fastAnc)
[1] 0.9483737
> mean(onCI.ace,na.rm=TRUE)
[1] 0.6738759
> N<-100
> for(i in 1:1000){
+ tree<-pbtree(n=N)
+ x<-fastBM(tree,internal=TRUE)
+ a<-fastAnc(tree,x[1:N],vars=TRUE,CI=TRUE)
+ onCI.fastAnc[i]<-sum((x[1:tree$Nnode+N]>a$CI95[,1])*(x[1:tree$Nnode+N]< a$CI95[,2]))/tree$Nnode
+ b<-ace(x[1:N],tree,CI=TRUE)
+ onCI.ace[i]<-sum((x[1:tree$Nnode+N]>b$CI95[,1])*(x[1:tree$Nnode+N]< b$CI95[,2]))/tree$Nnode
+ }
There were 24 warnings (use warnings() to see them)
> warnings()
Warning messages:
1: In sqrt(diag(solve(h))) : NaNs produced
2: In sqrt(diag(solve(h))) : NaNs produced
3: ...
> # this should be 0.95
> mean(onCI.fastAnc)
[1] 0.9483737
> mean(onCI.ace,na.rm=TRUE)
[1] 0.6738759
This simulation shows that although our 95% CIs computed in fastAnc include the generating values almost exactly 95% of the time (94.8% across 1000 simulations with 99 estimated ancestral states per simulation), ace CIs only include the generating value about 67% of the time.
I'm not exactly sure why this is the case, but my best guess is based on the warnings which tell us that the Hessian is being used to compute the standard errors of the parameter estimates and thus the CIs. This is an asymptotic property of the likelihood surface, and for finite sample this approximation can be quite bad (as we see above).
It's been a few years since I last read Rohlf (2001) but doesn't he mention how the CIs calculated by Schluter et al. are underestimated? I remember noticing once that ace's CIs were too small and just thinking it maybe was related to that.
ReplyDeleteAnyway, thanks for finally implementing the Rohlf CIs in R! I tried to do it myself (back when I last read that paper) but was never certain I did all the matrix math right...
Yes - you remember correctly. From p. 2148:
Delete"The maximum-likelihood estimation method of Schluter et al. (1997) yields standard errors similar to those produced by Martins and Hansen’s (1997) method. However, Martins’s (1999) simulation study demonstrates that there is a serious problem with both of these approaches because they yield estimates of the standard errors that are usually much too small."
Thanks for pointing this out.
Hi Liam,
ReplyDeleteI realize that this post is over a year old. But if possible, could you please elaborate on your point in the last paragraph
"This is an asymptotic property of the likelihood surface, and for finite sample this approximation can be quite bad (as we see above)."
I am having a similar problem in some analysis using sna::lnam function. I was hoping more explanation might aid me in solving some of the issues I've been having with the results.
Again, any advice you could give will be appreciated,
Sam