Monday, May 13, 2013

New version of rerootingMethod & estimating ancestral states with an ordered model

I just posted a new version of rerootingMethod. This function computes the empirical Bayes posterior probabilities (marginal ancestral state reconstructions) for a discrete character using the re-rooting method of Yang (e.g., Yang 2006). The main functional change in this version is that now the function can take a model index matrix (i.e., a matrix with the same dimensions as Q with integers to specify the different elements to be estimated), instead of just the string "ER" or "SYM". This should allow users to specify any arbitrary symmetric model - not just the full symmetric or equal-rates model.

Our index matrix contains an integer for each rate to be estimated - but values set to zero indicate that the corresponding transition rate should be zero. This caused a little bit of a problem for rerootingMethod because it had the code:

Q<-matrix(YY$rates[YY$index.matrix],ncol(XX),ncol(XX),
 dimnames=list(colnames(XX),colnames(XX)))

This code first creates a vector with the elements of the index matrix; and then coerces that vector into a matrix with dimensions specified by our number of states for the character. The problem is that index matrix elements that are zero will cause the length of the vector to be wrong, so that it cannot be coerced into a matrix with the right dimensions (we're kind of lucky that this is the case - or we might have missed the error). To solve this, I just did:

Q<-matrix(c(0,YY$rates)[YY$index.matrix+1],ncol(XX),
 ncol(XX),dimnames=list(colnames(XX),colnames(XX)))

One of the most common symmetric models other than the equal rates model that we might be interested in is probably the ordered model. This is a model in which we, say, allow transition ABCD, but no other types of transitions are allowed. How can we fit this model? Here's a quick demo in which I simulate under the model & then fit it & estimate ancestral states:

> require(phytools)
Loading required package: phytools
> packageVersion("phytools")
[1] ‘0.2.63’
> tree<-pbtree(n=100,scale=2)
> Q<-matrix(c(-1,1,0,0,1,-2,1,0,0,1,-2,1,0,0,1,-1),4,4)
> colnames(Q)<-rownames(Q)<-LETTERS[1:4]
> Q
   A  B  C  D
A -1  1  0  0
B  1 -2  1  0
C  0  1 -2  1
D  0  0  1 -1
> x<-sim.history(tree,Q)$states
> mm<-Q; diag(mm)<-0
> mm
  A B C D
A 0 1 0 0
B 1 0 1 0
C 0 1 0 1
D 0 0 1 0
> aa<-rerootingMethod(tree,x,model=mm)
> plot(tree,no.margin=TRUE,edge.width=2, show.tip.label=FALSE)
> nodelabels(pie=aa$marginal.anc,piecol=palette()[1:4], cex=0.5)
> tiplabels(pie=to.matrix(x,colnames(aa$marginal.anc)), piecol=palette()[1:4],cex=0.3)

Code for the new version of rerootingMethod is here; and the latest phytools build can be downloaded here.

1 comment:

  1. Hi Liam,
    I contacted you by mail, but the blog may be a nice channel for ideas exchange as well, I believe.

    I'm writting some code to actually simulate geographic dynamics using this CTMC framework. I implemented my code using geiger, but now that I've delved into phytools documentation and also this blog, I'm quite tempted to re-do my stuff to use phytools capabilities. I like the way you can do it all (simulate, fit and estimate) in a very clean and fast way. Since I'm planning a serious simulation study, this seems rather fit.

    Do you think it is possible to incorporate an asymmetric Q, provided the restrictions for detailed balance (upon exponentiation) hold?

    Cheers and thanks for providing,

    Luiz

    ReplyDelete