Thursday, April 18, 2013

Bug fix in new version of estDiversity for method="asr"

After building a new version of the function estDiversity, which estimates the lineage diversity at each internal node using a method based on (but modified from) Mahler et al. (2010), I realized today that there was a bug for estDiversity(...,method="asr"). Basically the lines:

tr<-reroot(tree,node.number=ee[j],position=tt-hh[j])
D[i,]<-D[i,]+apeAce(tr,x[tree$tip.label],model= model)$lik.anc[1,]
should have read:
tr<-reroot(tree,node.number=ee[j],position=tt-hh[j])
D[i,]<-D[i,]+apeAce(tr,x[tr$tip.label],model= model)$lik.anc[1,]
This is now fixed, but I have also modified the internally called function apeAce (a function for computing the conditional likelihoods which is modified from code in 'ape') so that it no longer requires that the phenotypic vector is in the order of tree$tip.label.

Yesterday I also made a little bit of hay over the fact that the two methods - one based on empirical Bayesian marginal ancestral state reconstruction at nodes & along edges in the tree; and the second based on stochastic mapping - could return different results. Well, it turns out I'm full of $h!t. At least in the one empirical example I've been running to test the code (from Luke Mahler) that discrepancy is entirely due to the bug. Here's what I mean:

> require(phytools)
Loading required package: phytools
> packageVersion("phytools")
[1] ‘0.2.48’
> d.asr<-estDiversity(tree,x)
Please wait. . . . Warning - this may take a while!
Completed 10 nodes
Completed 20 nodes
Completed 30 nodes
Completed 40 nodes
> d.sim<-estDiversity(tree,x,method="simulation")
Please wait. . . . Warning - this may take a while!
Completed 10 nodes
Completed 20 nodes
Completed 30 nodes
Completed 40 nodes
> plot(d.asr,d.sim)

This is a little disappointing as I thought that this might have been a nice illustration of the difference between marginal & joint reconstructions - turns out it is simply due to a bug! This distinction may still be important for some datasets - just not this one under the model we have chosen.

Since I was updating this function anyway, I now give user control of the model of evolution for the discrete character to the user; however be warned that for method="asr" only symmetric models are allowed. Non-symmetric models will be changed to method="ER". (This is not true of estDiversity(...,method="simulation") which should work for both symmetric and non-symmetric models.)

The new build of phytools (phytools 0.2-48) containing this updates is here.

2 comments:

  1. Thanks for all of that Liam! Looks to work well for the anole data set from our 2010 paper. If folks are interested, here's a short function for plotting pie charts representing reconstructed geographies for the internal nodes of a phylogeny. It requires that the package "mapplots" is installed:

    require(mapplots)
    plotNodePie = function(xvals,yvals,pie,pie.colors,radius=radius,...){
    plotPie = function(temp,pie.colors){
    add.pie(x=temp["xvals"],y=temp["yvals"],z=temp[-c(1:2)],radius=radius,col=pie.colors,labels=NA,init.angle=0)
    }
    plot(xvals,yvals,type="n",...)
    temp.matrix = cbind(xvals,yvals,pie)
    apply(temp.matrix,MARGIN=1,plotPie,pie.colors=pie.colors)
    return("done")
    }

    You can use this function to recreate many of the figures from our 2010 Evolution paper. The following script gives an example for the plot that illustrates how lineage diversity estimates for nodes vary over time, as anoles colonize different islands and radiate (different islands have different colors). This script assumes we've already read in a time-calibrated tree (GA_Anolis), and that we have a vector indicating the geographic location of each species (geography).

    ## First estimate ancestral area for each node.
    recon <- rerootingMethod(tree= GA_Anolis,x=geography,model="ER")
    for.piecharts <- recon$marginal.anc

    ## Calculate ages of all nodes in the tree.
    time.vector <- max(branching.times(GA_Anolis))-branching.times(GA_Anolis)

    ## Estimate lineage diversities of all nodes in the tree.
    diversity.vector <- estDiversity(GA_Anolis,x=geography,method="asr")

    ## Make a vector of colors for the areas (in the example, we've got 4 islands).
    piecolors <- c("firebrick3","gold2","chartreuse4","royalblue3")

    ## Finally, make a plot of lineage diversity estimates over time for all nodes.
    plotNodePie(xvals= time.vector,yvals= diversity.vector,pie=for.piecharts,pie.colors=piecolors,xlab="Time From Root",ylab="Lineage Diversity Estimate",radius=0.015)


    I can't show the plot in a comment, but you can see it here.
    Cuba is in red, Hispaniola yellow, Puerto Rico blue, and Jamaica green.

    ReplyDelete

Note: due to the very large amount of spam, all comments are now automatically submitted for moderation.