Thursday, August 15, 2013

New version of ancThresh with multiple models for the evolution of liability

The phytools package has a couple of different models that implement comparative methods using the threshold model of Wright (1934) and introduced into phylogenetic comparative biology by Felsenstein (2005, 2012). One of these is ancThresh, which fits a multi-state threshold model & uses Bayesian MCMC to estimate ancestral states at internal nodes of the tree. (See here for more information.)

Well, at interesting idea that came up at the NESCent Evolutionary Quantitive Genetics workshop was using other models (such as the OU model) for the evolution of liability on the tree. OU is often used as a model for stabilizing selection. Obviously, in a strict threshold character, liabilities are "hidden" and thus selection should not, in theory, be able to operate directly on liability. Nonetheless, we thought that OU might still be a good model for the evolution of liability under some circumstances. Some example might include, for instance, circumstances where liability has a pleiotropic effect on other non-threshold characters that are under selection; or when liability has natural bounds that create a tendency to revert to an intermediate value (e.g., blood hormone level cannot increase or decrease indefinitely without bounds).

I have just posted a new version of ancThresh that allows the user to fit OU as well as BM. A big question in my mind was how well it would work - given the shortage of data about trait evolution that seems likely to be containing in a two or three state discretely valued trait.

This new version is in a new phytools build (phytools 0.3-33). Let's try it out:

> require(phytools)
Loading required package: phytools
> packageVersion("phytools")
[1] ‘0.3.33’
> require(geiger)
Loading required package: geiger
> ## simulate
> tree<-pbtree(n=100,scale=1)
> l<-fastBM(ouTree(tree,2),a=0.25,sig2=1,internal=TRUE)
> x<-sapply(l,threshState,setNames(c(0,0.5,Inf), LETTERS[1:3]))
> summary(as.factor(x))
A B C
66 100 33
> mcmc<-ancThresh(tree,x[1:100],ngen=200000,control= list(sample=100),model="OU")
**** NOTE: no sequence provided, using alphabetical or numerical order
MCMC starting....
gen 1000
gen 2000
...
gen 200000
> ## first let's see how it does estimating alpha
> burnin<-100000
> ii<-which(mcmc$par[,"gen"]==burnin)
> ps.alpha<-mcmc$par[ii:nrow(mcmc$par),"alpha"]
> mean(ps.alpha)
[1] 1.9842
> pd<-density(ps.alpha,bw=0.4)
> plot(pd,xlab="alpha",main="posterior density of alpha")
> lines(c(2,2),c(0,max(pd$y)),lty="dashed")
> ## now we can see how ancestral states were estimated
> plotTree(tree,setEnv=TRUE,ftype="off")
setEnv=TRUE is experimental. please be patient with bugs
> colors<-setNames(c("blue","green","red"),LETTERS[1:3])
> tiplabels(pie=to.matrix(x[1:100],LETTERS[1:3]),piecol= colors,cex=0.4)
> XX<-t(apply(mcmc$mcmc[ii:nrow(mcmc$mcmc),],2,function(x) summary(factor(x,levels=LETTERS[1:3]))/(nrow(mcmc$mcmc)-ii+1)))
> nodelabels(pie=XX,piecol=colors,cex=0.8)
> nodelabels(pie=to.matrix(x[1:tree$Nnode+100], LETTERS[1:3]),piecol=colors,cex=0.4)
> ## finally, let's see how well liabilities were estimated
> plot(l,colMeans(mcmc$liab[ii:nrow(mcmc$liab),]), xlab="true liability",ylab="estimated liability", main="estimated liability")
> lines(range(l),range(l),lty="dashed")

So, somewhat surprisingly, with as little as three discrete character states we are doing quite well (at least in this instance) of estimated α in the OU model; and we are fairly good at reconstructing ancestral states & liabilities.

That's pretty cool.

2 comments: