Friday, November 10, 2017

New methods for Bayesian comparative analysis with within-species variation (fitBayes in phytools)

Several years ago Graham Reynolds and I published a new Bayesian method for fitting comparative models while taking intraspecific variation into account (Revell & Reynolds, 2012). In our method the input data are observations for individuals and then the method samples the species means, within-species variances, and the parameters of the evolutionary model from their joint posterior probability distribution. We implemented two different models: the standard Brownian model; and Pagel's λ model - although in principle the approach could be applied to basically any evolutionary model.

Unfortunately, although I thought it was quite creative at the time, it doesn't seem that the method has seen a lot of use. It nonetheless has long been available in the phytools package function fitBayes. Today, as I have been making a lot of other phytools updates, I also added nice print & plot methods to this function. In so doing, I had to change the object returned by the function - and not merely by adding a class attribute. So far as I can figure out no other package imports fitBayes so I'm hopefully that so doing will not mess up anyone else's code!

Here's a demo using a simulated case.

First, we can see that the data consists of a vector containing more than one individual per species (and thus duplicate names):

library(phytools)
tree
## 
## Phylogenetic tree with 80 tips and 79 internal nodes.
## 
## Tip labels:
##  t53, t54, t46, t47, t16, t39, ...
## 
## Rooted; includes branch lengths.
x
##         t53         t53         t53         t54         t54         t46 
## -1.36584559 -1.75702668 -1.49500879  0.02734140 -0.79126533 -1.27787349 
##         t46         t46         t47         t47         t16         t16 
## -1.63292401 -0.73905774  0.28722121  0.56724541 -1.81926475 -2.47574179 
##         t39         t69         t69         t70         t70         t25 
## -3.32653584 -1.11311653 -1.83920781 -1.13687208 -1.11557657 -1.87615966 
##         t26         t26         t42         t51         t51         t51 
##  0.54964087  0.69032594  0.08008442 -0.79971801 -0.23063656 -1.52571729 
##         t52         t61         t61         t61         t62         t62 
## -1.88273303  0.83037441 -0.26580903  0.48519237  0.24630469 -0.10446499 
##         t12         t12         t12          t8         t22         t23 
## -2.21051983 -2.01120769 -1.13601674 -1.92495539 -0.38584932 -2.99595120 
##         t23         t23         t17         t17         t15         t35 
## -2.88622929 -4.05081746 -3.07593710 -3.41281774  2.01843387 -0.07740489 
##         t36         t13         t18         t18         t19         t19 
## -0.25982634  0.71885283  0.38719086  0.21597236 -0.88696379 -1.73680934 
##         t19          t1          t1          t1         t57         t57 
## -1.49365966 -0.91797001 -2.42000850 -2.06134226 -1.79894241 -1.82472322 
##         t58         t55         t55          t9         t10         t10 
##  0.05877151  0.76626992 -0.28887442 -1.46490403 -0.47219582 -1.99761496 
##         t41         t41         t41         t71         t71         t71 
## -1.06168695  0.44052412 -0.82993642  0.49956333 -0.47575522 -1.07867418 
##         t72         t29         t29         t29          t7          t7 
## -0.91618697 -1.50467425 -0.61245609 -0.92505143 -2.84765531 -1.84259199 
##          t7         t24         t24         t24         t77         t78 
## -1.07225521 -1.88147239 -1.72905622 -1.25461136 -1.89875400 -0.69589218 
##         t78         t56         t48         t48         t49         t49 
## -0.23529659 -0.14733604 -3.06802324 -3.69867859  0.82120645  0.20412792 
##         t49         t14         t14         t14         t79         t79 
##  0.49970079 -0.30600076 -0.42009051 -0.01485346 -2.39829923 -2.35296242 
##         t79         t80         t11         t11          t2         t40 
## -3.34863939 -2.05243484 -2.92288436 -1.94628844  0.24047944 -0.14631321 
##         t40         t50         t50         t63         t63         t64 
## -0.91321826 -0.90234613 -0.97583920 -1.07853744 -2.14910943 -2.65840596 
##         t64         t64         t27         t27         t28         t30 
## -2.68872053 -2.26340103  1.28033842  0.48223057 -1.02480630 -0.77859482 
##         t30         t31         t31         t59         t59         t59 
## -0.95493748  1.38078518  0.66078132  1.88896094  2.39002261  1.79299996 
##         t60         t60         t65         t65         t66         t75 
## -0.39027273 -0.95939279 -0.19917796  0.01410087 -1.73348047  0.94586597 
##         t75         t76         t20         t20         t21         t21 
##  1.37552716  1.91131653 -2.43740326 -1.31988357  0.84776277  1.25367141 
##         t37         t37         t37         t38         t38         t38 
## -1.87392354 -1.11631045 -2.07462260 -1.47891466 -2.13206798 -2.41424014 
##          t6          t4          t4          t4          t5          t5 
## -3.00514922 -2.51146433 -2.11978749 -1.82213684 -1.24949167 -1.38174359 
##          t5         t73         t73         t73         t74         t74 
## -1.37899029 -0.45279841  0.01804964  0.03556811 -0.94337893 -0.33991667 
##         t45         t45         t45          t3          t3         t33 
## -0.97676032 -1.49846820 -0.96005447  0.37642114  0.74953120 -3.22641999 
##         t67         t67         t67         t68         t68         t34 
## -1.29478388 -1.72266003 -1.26569638 -4.13732591 -3.48843187 -2.07107262 
##         t34         t34         t32         t32         t32         t43 
## -0.96212798 -3.22461039 -2.79090429 -2.88259079 -2.52733125 -1.48711649 
##         t44 
## -0.54971583

Now let's go ahead & run our MCMC. I'm going to run the λ model.

obj<-fitBayes(tree,x,ngen=1000000,model="lambda")
## Control parameters (set by user or default):
## List of 9
##  $ sig2   : num 12.7
##  $ lambda : num 1
##  $ a      : num -0.99
##  $ xbar   : Named num [1:80] -1.539 -0.382 -1.217 0.427 -2.148 ...
##   ..- attr(*, "names")= chr [1:80] "t53" "t54" "t46" "t47" ...
##  $ intV   : num 0.252
##  $ pr.mean: num [1:84] 1000 0.502 0 0 0 ...
##  $ pr.var : num [1:84] 1e+06 1e+00 1e+03 1e+03 1e+03 ...
##  $ prop   : num [1:84] 0.127 0.02 0.127 0.127 0.127 ...
##  $ sample : num 100
## Starting MCMC...
## Done MCMC.

Note that the control parameters of the MCMC can be set by the user, but absent that the function will try to identify sensible values from the data.

Let's look at our object:

obj
## 
## Object of class "fitBayes" which consists of a posterior sample
## (in component 'mcmc') from a Bayesian MCMC of the model presented
## in Revell & Reynolds (2012; Evolution).
## 
## Object summary:
##  generations of MCMC: 1e+06.
##  sample interval: 100.
##  mean sigma^2 from posterior sample: 2.047457.
##  mean lambda from posterior sample: 0.43923.
## 
## Calculations based on burn-in of 2e+05 generations.
plot(obj,what="sig2")

plot of chunk unnamed-chunk-3

plot(obj,what="lambda")

plot of chunk unnamed-chunk-3

Neat. In our case the generating parameter values were σ2 = 2.0 and λ = 0.5. (Our posterior distribution of λ is quite broad.)

In fact, the data & tree were simulated as follows:

library(phytools)
tree<-pbtree(n=80,scale=1)
xbar<-fastBM(phytools:::lambdaTree(tree,0.5),
    sig2=2.0) # true species means
x<-vector()
for(i in 1:Ntip(tree)){
   n<-floor(runif(n=1,min=1,max=4))
   y<-rnorm(n=n,mean=xbar[i],sd=sqrt(0.2))
   names(y)<-rep(names(xbar)[i],length(y))
   x<-c(x,y)
}

That's it.

No comments:

Post a Comment