Wednesday, April 17, 2013

New version of estDiversity with multiple methods

I just posted a new version of the function estDiversity, which also adds a new method for computing the reconstructed diversity at each internal node. This function basically implements the method of Mahler et al. (2010) in which we first reconstructed historical biogeography of Greater Antillean Anolis - and then we estimated the standing 'island-wise' lineage density at each node, integrating over the islands on which the node might have been found.

Our method for doing this is relatively simple. We just reconstructed ancestral states first at each node, and then along each edge at the height of the node; and then we just added the probabilities from the latter & multiplied them by the probabilities of the former. So, for instance, if a given target node had a (empirical Bayesian) posterior probability of being on Puerto Rico of 0.5, and Hispaniola of 0.5; and a hypothetical competing lineage was on Puerto Rico with probability 0.5 & Hispaniola with probability 0.5 (for convenience of illustration), then our point estimate of the number of competing lineages for the first node is:

=0.5 × 0.5 + 0.5 × 0.5

in which subscripts t & c are target & competitor, respectively.

This makes perfect sense because the following four scenarios are expected with equal probability: (PRt , PRc), (PRt , Hc), (Ht , PRc), and (Ht , Hc) - and exactly 1/2 of the time our target node arises with one competitor. The same logic extends to multiple competing lineages, probabilities different from 0.5, & more than two islands.

The only identified issue with prior versions of estDiversity is that I had been using scaled conditional likelihoods for Pt instead of marginal reconstructed ancestral states; whereas for Pc I was (properly) using marginal reconstructions. (For more information on the distinction, search my blog.) I've realized that this was not right for some time, but had not heard any interest in the function so I hadn't gotten around to fixing it.

In addition to this, though, the new version of the function also gives the option of using stochastic maps to compute ancestral lineage density directly. It does this just by performing stochastic mapping using make.simmap, and then - at each internal node - counting the number of lineages at the height of the node with the same state as the node. For each stochastic map, each time a new node coexists with a competitor from mapped to the same island, we add 1/nsim. Otherwise, we add nothing.

We should probably actually prefer to this because this will account for the possibility (virtually guaranteed when nodes are close to each other in the tree) that the probabilities at a node and along competing edges are not independent.

Both methods are really slow, unfortunately. They seem to yield similar results:

> d.asr<-estDiversity(tree,x)
Please wait. . . . Warning - this may take a while!
Completed 10 nodes
Completed 20 nodes
Completed 30 nodes
Completed 40 nodes
> d.asr
       48         49         50         51         52
 0.0000000  0.8721167  1.8606261  3.2793402  4.7093087
        53         54         55         56         57
15.0590177  6.5464430  9.5738490 11.7358135  0.2020746
        58         59         60         61         62
 1.0892816  2.0400146  3.0361920  3.6605135  0.4269534
        63         64        ...
 1.2145138        ...
> d.maps<-estDiversity(tree,x,method="simulation")
Please wait. . . . Warning - this may take a while!
Completed 10 nodes
Completed 20 nodes
Completed 30 nodes
Completed 40 nodes
> > d.maps
   48    49    50    51    52    53    54    55    56    57
 0.00  0.99  2.67  3.64  6.28 16.08  9.10 12.02 14.04  0.22
   58    59    60    61    62    63    64    65   ...
 1.03  2.03  3.03  4.20  0.39  1.19  2.19   ...
> plot(d.asr,d.maps)

This new version depends on a number of other updates, so to test it (and please do!), I recommend downloading the newest phytools build (phytools 0.2-47), rather than trying to run the function from source.

1 comment: