In response to a recent R-sig-phylo thread I thought it might be useful to post code demonstrating how to compute the relative frequency of changes of particular types along edges in a posterior sample from the method of stochastic mapping.

```
## load libraries
library(phytools)
## here is our data
tree
```

```
##
## Phylogenetic tree with 26 tips and 25 internal nodes.
##
## Tip labels:
## A, B, C, D, E, F, ...
##
## Rooted; includes branch lengths.
```

```
x
```

```
## A B C D E F G H I J K L M N O P Q R
## "a" "a" "a" "b" "b" "b" "b" "b" "b" "a" "a" "a" "b" "a" "a" "a" "a" "a"
## S T U V W X Y Z
## "a" "a" "a" "b" "b" "b" "b" "b"
```

```
## obtain stochastic maps:
trees<-make.simmap(tree,x,model="ER",nsim=100)
```

```
## make.simmap is sampling character histories conditioned on the transition matrix
##
## Q =
## a b
## a -2.818303 1.409151
## b 1.409151 -2.818303
## (estimated using likelihood);
## and (mean) root node prior probabilities
## pi =
## a b
## 0.5 0.5
```

```
## Done.
```

Now let's do our analysis. The following function computes for a given edge
(by row in `tree$edge`

, but we could easily modify to be by
node number) the relative frequency of trees in which the state at the
start of the edge is `"a"`

and the state at the end is
`"b"`

.

```
## this function computes for a given edge
foo<-function(edge,trees,states){
obj<-sapply(trees,function(x,e,s)
if(names(x$maps[[e]])[1]==s[1]&&names(x$maps[[e]])[length(x$maps[[e]])]==s[2]) TRUE
else FALSE,e=edge,s=states)
sum(obj)/length(obj)
}
```

We can use it by iterating over edges:

```
edge.probs<-sapply(1:nrow(tree$edge),foo,trees=trees,states=c("a","b"))
edge.probs
```

```
## [1] 0.09 0.21 0.00 0.00 0.00 0.00 0.00 0.53 0.00 0.27 0.43 0.02 0.01 0.01
## [15] 0.00 0.00 0.00 0.00 0.01 0.00 0.01 0.01 0.00 0.00 0.02 0.00 0.97 0.00
## [29] 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.13 0.05 0.00 0.00 0.00 0.00 0.00
## [43] 0.41 0.00 0.00 0.40 0.03 0.03 0.00 0.00
```

Finally, let's visualize the results on a 'representative' stochastic map:

```
plot(trees[[1]],colors=setNames(c("blue","red"),c("a","b")),
ylim=c(-2,Ntip(tree)))
par(fg="transparent") ## just for clarity
threshold<-0 ## threshold frequency above which we want to plot it
edgelabels(edge=which(edge.probs>threshold),
pie=edge.probs[which(edge.probs>threshold)],
piecol=c("red","black"),cex=0.8)
par(fg="black")
add.simmap.legend(colors=setNames(c("red","black"),c("freq. change from a to b",
"freq. no change or change from b to a")),prompt=FALSE,x=0,y=0)
```

Or we can change the threshold to `0.05`

:

```
plot(trees[[1]],colors=setNames(c("blue","red"),c("a","b")),
ylim=c(-2,Ntip(tree)))
par(fg="transparent") ## just for clarity
threshold<-0.05 ## threshold frequency above which we want to plot it
edgelabels(edge=which(edge.probs>threshold),
pie=edge.probs[which(edge.probs>threshold)],
piecol=c("red","black"),cex=0.8)
par(fg="black")
add.simmap.legend(colors=setNames(c("red","black"),c("freq. change from a to b",
"freq. no change or change from b to a")),prompt=FALSE,x=0,y=0)
```

That's about it.

BTW, for this example the data were simulated as follows:

```
tree<-pbtree(n=26,tip.label=LETTERS,scale=1)
Q<-matrix(c(-1,1,1,-1),2,2)/2
rownames(Q)<-colnames(Q)<-letters[1:2]
x<-sim.history(tree,Q,anc="a")$states
```

## No comments:

## Post a Comment