Phylogenetic Tools for Comparative Biology: Ancestral character estimation under the threshold model, part II

Sunday, October 14, 2012

Ancestral character estimation under the threshold model, part II

Another quick post about ancestral character estimation under the threshold model from quantitative genetics. For a reminder of the details of this model, please check out my prior posts on the topic, as well as Felsenstein (2012).

The central idea is that this model, in which the value for a discrete character is determined by the evolution of an underlying continuous trait called 'liability,' might be a more appropriate model for many discretely manifested, by complex or polygenic, attributes of organismal phenotype.

I'm estimating under this model by using Bayesian MCMC to sample ancestral and tip liabilities, and the thresholds that result in a change in the discrete character, from their joint posterior probability distribution.

I'm still working on this, but here is an example simulation, analysis, and then some assessments of performance:

> source("ancThresh.R")
> # simulate liabilities
> l<-fastBM(tree,internal=TRUE,sig2=2,a=-1)
> # translate liabilities to threshold character
> x<-rep("blue",length(l))
> x[l>0]<-"green"
> x[l>1]<-"red"
> x[l>6]<-"yellow" > names(x)<-names(l)
> x
t54 t55 t51 ...
"blue" "blue" "blue" ...
> x11(); res<-ancThresh(tree,x[1:length(tree$tip)],ngen=100000)
**** NOTE: no sequence provided, using alphabetical or numerical order
MCMC starting....
gen 1000
gen 2000
...

> # plot the likelihood profile
> plot(res$par[,"gen"],res$par[,"logLik"],xlab="gen", ylab="logL")
> lines(res$par[,"gen"],res$par[,"logLik"])

> # pull estimates as max PP for each node
> est<-apply(res$ace,1,function(x) names(x)[which(x==max(x))])
> est
101 102 103 ...
"red" "blue" "blue" ...
> # do they match?
> a<-x[101:199] # get simulated ancestors
> sum(a==est)/length(a)
[1] 0.7878788 # about 79% match
> # what if we restrict to >0.5, 0.7, or 0.9 PP?
> any50<-apply(res$ace,1,function(x) any(x>0.5))
> sum(any50)/tree$Nnode # gives us the % nodes with PP>0.5
[1] 0.969697
> # now the fraction that match, given PP>0.5
> sum(a[any50]==est[any50])/sum(any50)
[1] 0.7916667
> # PP > 0.7
> any70<-apply(res$ace,1,function(x) any(x>0.7))
> sum(any70)/tree$Nnode
[1] 0.5050505
> sum(a[any70]==est[any70])/sum(any70)
[1] 0.96
> # 96% of nodes with PP>0.7 are correct
> # PP > 0.9
> any90<-apply(res$ace,1,function(x) any(x>0.9))
> sum(any90)/tree$Nnode
[1] 0.2929293
> sum(a[any90]==est[any90])/sum(any90)
[1] 1

Cool. I will post the code & more on this very soon.

2 comments:

ConostegiaOctober 25, 2012 at 7:14 AM
Dear Liam,

Is there a way to run threshBayes with a single discrete character?

Also, I am relatively new to R so when I run threshBayes with two characters I do not know how to plot the results onto the phylogeny as you did above.

I would be very greatful for any tips!

Ricardo
ReplyDelete
Replies

Note: due to the very large amount of spam, all comments are now automatically submitted for moderation.

Pages

Sunday, October 14, 2012

Ancestral character estimation under the threshold model, part II

2 comments: