Phylogenetic Tools for Comparative Biology: Using markChanges to show the distribution of implied changes on a tree from stochastic mapping

Tuesday, May 17, 2016

Using `markChanges` to show the distribution of implied changes on a tree from stochastic mapping

phytools has a function called markChanges that can be used to highly changes of different types on the tree. Various uses of this function are documented here; however they focus on marking the changes from a single stochastic map history. Alternatively, we might like to look at the distribution of changes across a sample of stochastic maps.

This turns out to be really easy to do.

So, let's start with the following tree & data:

library(phytools)
tree

## 
## Phylogenetic tree with 31 tips and 30 internal nodes.
## 
## Tip labels:
##  X1, A, B, C, D, E, ...
## 
## Rooted; includes branch lengths.

##  X1   A   B   C   D   E  X2  X3   F   G  X4   H   I   J   K   L   M   N 
## "a" "b" "b" "b" "b" "b" "a" "a" "b" "b" "b" "a" "a" "a" "a" "a" "a" "a" 
##   O   P  X5   Q   R   S   T   U   V   W   X   Y   Z 
## "a" "a" "a" "a" "b" "b" "a" "a" "a" "a" "a" "a" "a"

colors<-setNames(c("blue","red"),c("a","b"))
dotTree(tree,x,colors=colors)

plot of chunk unnamed-chunk-1

We can do stochastic mapping as follows:

## first let's pick a model
fitER<-fitMk(tree,x,model="ER")
fitER

## Object of class "fitMk".
## 
## Fitted (or set) value of Q:
##           a         b
## a -0.027553  0.027553
## b  0.027553 -0.027553
## 
## Fitted (or set) value of pi:
##   a   b 
## 0.5 0.5 
## 
## Log-likelihood: -10.956725

fitARD<-fitMk(tree,x,model="ER")
fitARD

## Object of class "fitMk".
## 
## Fitted (or set) value of Q:
##           a         b
## a -0.027553  0.027553
## b  0.027553 -0.027553
## 
## Fitted (or set) value of pi:
##   a   b 
## 0.5 0.5 
## 
## Log-likelihood: -10.956725

LR<--2*(logLik(fitER)-logLik(fitARD))
P<-pchisq(LR,df=1,lower.tail=FALSE)
P ## we cannot reject "ER"

## [1] 1

trees<-make.simmap(tree,x,nsim=100,model="ER")

## make.simmap is sampling character histories conditioned on the transition matrix
## 
## Q =
##             a           b
## a -0.02755304  0.02755304
## b  0.02755304 -0.02755304
## (estimated using likelihood);
## and (mean) root node prior probabilities
## pi =
##   a   b 
## 0.5 0.5

## Done.

trees

## 100 phylogenetic trees with mapped discrete characters

Now, let's make our plot. Unlike in the example referenced above, here I'm going to overlay the changes on on a dotTree plot, instead of on any single stochastic map. To make the density of the marked changes a little bit easier to read, I'm going to make them semi-transparent with the aide of the phytools internal function make.transparent (which I borrowed from something Josef Uyeda shared with me a long time ago):

dotTree(tree,x,colors=colors)
mt<-phytools:::make.transparent ## using the internal function 
nulo<-sapply(trees,markChanges,sapply(colors,mt,0.4))
add.simmap.legend(colors=sapply(setNames(colors[2:1],c("a->b","b->a")),
    mt,0.4),prompt=FALSE,x=0.5,y=30,vertical=TRUE)

plot of chunk unnamed-chunk-3

If you wanted to go crazy, you could also add the posterior probabilities at nodes as follows:

obj<-summary(trees)
obj

## 100 trees with a mapped discrete character with states:
##  a, b 
## 
## trees have 5.1 changes between states on average
## 
## changes are of the following types:
##       a,b  b,a
## x->y 4.25 0.85
## 
## mean total time spent in each state is:
##                a          b    total
## raw  141.2496391 39.5391057 180.7887
## prop   0.7812966  0.2187034   1.0000

dotTree(tree,x,colors=colors)
nulo<-sapply(trees,markChanges,sapply(colors,mt,0.4))
add.simmap.legend(colors=sapply(setNames(colors[2:1],c("a->b","b->a")),
    mt,0.4),prompt=FALSE,x=0.5,y=30,vertical=TRUE)
nodelabels(pie=obj$ace,cex=0.6,piecol=colors)

plot of chunk unnamed-chunk-4

Although that could be a bit excessive.

In any case, it gives a good idea of where on the tree changes in state most likely occured. For instance, it is very interesting to note that changes a->b most likely occurred independently on the lineages leading to X4 and (G,F), rather than once in their common ancestor followed by a reversion b->a.

That's all.

Phylogenetic Tools for Comparative Biology

Pages

Tuesday, May 17, 2016

Using `markChanges` to show the distribution of implied changes on a tree from stochastic mapping

No comments:

Post a Comment