Monday, August 29, 2011

New version of evol.vcv()

Ok, I just posted a new version of evol.vcv() - direct link to code here. The main new feature of this version (v0.2) is that it also computes the variances of the parameter estimates. (The standard errors can then just be computed by taking the square roots.) These variances come from the matrix of partial second derivatives of the likelihood surface at the optimum (i.e., the curvature of the surface), also known as the Hessian matrix. The intuitive relationship between the curvature of the likelihood surface and the uncertainty of our parameter estimates is quite straightforward: if the likelihood surface is very flat (low curvature), then the uncertainty in our parameter estimates is large; conversely if the likelihood surface is very peaked at the optimum (i.e., high curvature), then this implies that uncertainty in our parameter estimates is low.

This function (that is, evol.vcv()) implements the method of Revell & Collar (2009) and was also described in a prior blog post.

In R, normally obtaining the Hessian matrix is easy because the built-in optimizer, optim(), optionally returns its value. Unfortunately, as I outlined in a previous post, this would not do for evol.vcv() because in this function I actually optimized the likelihood with respect to the Cholesky matrices rather than the evolutionary variance-covariances themselves. This was done for computational reasons because the evolutionary variance-covariance matrices (called "rate matrices" in our paper) are subject to annoying constraints which mean that for most random values the likelihood cannot be computed.

The solution, which I figured out yesterday and then implemented today, was to optimize with respect to the Cholesky matrices (as before) - but then differentiate the likelihood surface for the full VCV matrices at the optimum using hessian() in the "numDeriv" package (CRAN page here). This seems to work - and this analysis is implemented in v0.2 of evol.vcv() on my R phylogenetics page. This will also be included in the next version of "phytools".

The variances are not computed by default. To get them, load the present version from source:

> source("http://faculty.umb.edu/liam.revell/phytools/evol.vcv/v0.2/evol.vcv.R") # this should work
> result<-evol.vcv(tree,X,vars=TRUE)

for SIMMAP style tree and data matrix X. Good luck, and please report if this function seems to work properly or not.

1. Note that it is also necessary to install the "numDeriv" library.

2. Liam, It works! This is a nice addition to the function.

3. Thanks Dave. I still want to subject this to more testing - just to make sure that it is doing what it is supposed to be doing - but thanks for the report!

4. Hey Liam, I am running this function for a set of stochastic character histories obtained from SIMMAP, and I occasionally get negative variances for the parameter estimates. I am fitting one and two rate matrix models for three traits and two phylogenetic partitions. Any idea why this is happening?

5. Hi Dave. Thanks for sending me your data & tree for this.

I have been working on this all morning & there is no clear "bug" (in the traditional sense), in that everything seems to be computing correctly. However, the negative variance implies positive curvature of the likelihood surface at the optimum. (This is concerning since at the optimum curvature should be negative.)

In your example, it is possible that the curvature is just very flat for the parameter in question and thus the optimizer effectively ends up in a very slightly positively curved location on the likelihood surface in that dimension. I will try different starting values to suss this out.

- Liam

6. I have noticed an association between negative variances and unusually large magnitude parameter estimates. (By unusually large, I mean relative to estimates for that parameter on trees/stochastic character maps). Could a flat likelihood curve also explain these extreme parameter values?

I should also note, though, that for each case, evol.vcv tells me that the "Optimization has converged."

7. Hi Dave. That sounds like a good guess. I will try different starting values and see if the optimization "converges" to a different optimum. - Liam

8. This might be a silly question, but under the 1 vcv matrix case what exactly is being optimized? When going through the code it seems that the vcv matrix is calculated in the "Compute the MLE of R" step. I can't seem to compute the log-likelihood (from the cholesky matrices) because p <- dim(C)[3] gives NA, in my case C is a 5X5 matrix.