Saturday, December 28, 2013

New method to locate one or multiple rate shifts on a tree using likelihood

I just posted a new phytools function, rateshift, that fits a model in which there are one or multiple Brownian rate shifts in the tree at different heights above the root. The idea is that we don't need to specify the locations of the rate shifts a priori (as we can already do using brownie.lite); rather, we let the data determine where the rate shifts are located. Turns out that this isn't too hard, it just requires p-1 additional parameters for each extra rate above the root.

It also occurred to me that it's entirely possible this method is already in the literature - in which case, please accept my apologies for having missed it!

In the simplest case, this would just be a model with two evolutionary rates: σ2(1) rootward of the shift, and σ2(2) tipward; and one rate shift. We then jointly maximize the likelihood of the rates & position of the rate shift.

Here's a quick demo:

> require(phytools)
Loading required package: phytools
Loading required package: ape
Loading required package: maps
Loading required package: rgl
> packageVersion("phytools")
[1] ‘0.3.85’
> tree<-pbtree(n=100,scale=1)
> tree<-make.era.map(tree,c(0,0.5,0.8))
> x<-sim.rates(tree,c(1,10,1))
names absent from sig2: assuming same order as $mapped.edge
> fit<-rateshift(tree,x,nrates=3,print=TRUE,plot=TRUE, tol=1e-5)
Optimization progress:

s^2(1)  s^2(2)  s^2(3)  shift:1 shift:2 logL
2.623   2.623   2.623   0.3333  0.6667  -117.6615
2.624   2.623   2.623   0.3333  0.6667  -117.6616
....
2.1182  7.6728  0.7129  0.5497  0.8235  -100.7116
2.1182  7.6728  0.7109  0.5497  0.8235  -100.7163
2.1182  7.6728  0.7119  0.5507  0.8235  -100.7079
....
1.1796  8.6499  0.7848  0.5886  0.8116  -100.1251
1.1796  8.6499  0.7848  0.5896  0.8126  -100.1175
1.1796  8.6499  0.7848  0.5896  0.8106  -100.1275

> fit
ML 3-rate model:
      s^2(1) se(1)  s^2(2) se(2)  s^2(3) se(3)  k  logL
value 1.1796 0.991  8.6499 2.3647 0.7848 0.1709 6  -100.12

Shift point(s) between regimes (height above root):
        1|2     se(1|2) 2|3     se(2|3)
value   0.5896  0.0173  0.8126  0.0173

R thinks it has found the ML solution.

The options plot & print slow down runtime; however I included them for the purposes of debugging and it is kind of neat to visualize the optimization of the locations of the rate shift.

The implementation is a little buggy - however at least in this example it seems to do pretty well at finding our generating rate shift points (which were, remember, 0.5 & 0.8 units above the root), and rates (1, 10, & 1). Cool!

This function is in a new phytools build (phytools 0.3-85), which can be downloaded & installed from source. Please check it out.

No comments:

Post a Comment