Today I received the following question about stochastic character mapped trees in phytools:
I'm working on a data set where I would like to infer the absolute
date associated with a transition from one character state to another on
a phylogeny (in my case its a habitat transition in a group of fishes).
I am using SIMMAP to map the binary traits onto a subset of trees from a
BEAST analysis…. Basically, I would be happy with simply extracting absolute
time of each character transition from each SIMMAP tree. For example, on
each tree, I just want to be able to extract that transitions happened at
10.1 mya, 12.5 mya, 18.8 mya, etc…. I just cannot figure out a
straight-forward way to do this for 100-1000 trees and I was wondering if
there was a function in phytools that was designed for just this sort of
thing.
Well, there are several functions in phytools, such as
contSimmap
and
describe.simmap
that, in addition to
densityMap
,
can be used to summarize the results of stochastic mapping in phytools;
however none contain the precise function that is being sought after here.
That said, it should not be too hard.
Let's start by seeing how it can be done on a single history. I'll keep
it simple, and simulate a tree with three states: "a"
,
"b"
, and "c"
:
library(phytools)
Q<-matrix(c(-1,1,0,1,-2,1,0,1,-1),3,3)
colnames(Q)<-rownames(Q)<-letters[1:3]
Q
## a b c
## a -1 1 0
## b 1 -2 1
## c 0 1 -1
tree<-sim.history(pbtree(n=26,tip.label=LETTERS,
scale=1),anc="a",Q)
## Done simulation(s).
This is a simulated history - but this would obviously normally be used
with a stochastically mapped discrete character history generated, for
example, with the phytools function
make.simmap
.
I'm going to plot the tree with it's mapped discrete character history - but I
will also use the function markChanges
that I just posted on
my blog
here
so that we can see the temporal position of the changes a little bit more
easily:
plotSimmap(tree,mar=c(4.1,0.1,0.1,0.1))
## no colors provided. using the following legend:
## a b c
## "black" "red" "green3"
axis(1)
## so we can see the changes more clearly:
markChanges<-function(tree,colors=NULL,cex=1,lwd=2){
states<-sort(unique(getStates(tree)))
if(is.null(colors)) colors<-setNames(palette()[1:length(states)],
states)
obj<-get("last_plot.phylo",envir=.PlotPhyloEnv)
nc<-sapply(tree$maps,length)-1
ii<-which(nc>0)
nc<-nc[ii]
h<-vector()
for(i in 1:length(ii)){
for(j in 1:nc[i]){
ss<-names(tree$maps[[ii[i]]])[j+1]
mm<-tree$edge[ii[i],1]
dd<-tree$edge[ii[i],2]
x<-rep(obj$xx[mm]+cumsum(tree$maps[[ii[i]]])[j],2)
y<-c(obj$yy[dd]-0.5*mean(strheight(LETTERS)*cex),
obj$yy[dd]+0.5*mean(strheight(LETTERS)*cex))
lines(x,y,lwd=lwd,col=colors[ss],lend=2)
h<-c(h,x[1])
}
}
invisible(h)
}
markChanges(tree,lwd=3)
Now we can use some of the same tricks in the function above to get the heights of all the changes in the tree, along with their types:
x<-getStates(tree,"tips")
states<-sort(unique(x))
m<-length(states)
ct<-sapply(states,function(x,y) sapply(y,function(y,x)
paste(x,"->",y,sep=""),x=x),y=states)
ct
## a b c
## a "a->a" "b->a" "c->a"
## b "a->b" "b->b" "c->b"
## c "a->c" "b->c" "c->c"
ii<-matrix(TRUE,m,m)
diag(ii)<-rep(FALSE,m)
changes<-vector(mode="list",length=m*(m-1))
names(changes)<-as.vector(ct[ii])
nc<-sapply(tree$maps,length)-1
ind<-which(nc>0)
nc<-nc[ind]
H<-nodeHeights(tree)
maps<-tree$maps[ind]
for(i in 1:length(maps)){
for(j in 1:nc[i]){
sc<-paste(names(maps[[i]])[j:(j+1)],collapse="->")
h<-H[ind[i],1]+cumsum(maps[[i]])[j]
changes[[sc]]<-c(changes[[sc]],as.numeric(h))
}
}
changes<-changes[!sapply(changes,is.null)]
changes<-lapply(changes,sort)
changes
## $`a->b`
## [1] 0.3263951 0.5449173 0.5896102 0.6901126
##
## $`b->a`
## [1] 0.4872248
##
## $`b->c`
## [1] 0.6666082 0.8735417
We should be able to see by plotting the changes on the tree that our computed heights correspond with the mapped changes on our plotted tree:
plotSimmap(tree,ylim=c(0,Ntip(tree)),mar=c(4.1,0.1,0.1,0.1))
## no colors provided. using the following legend:
## a b c
## "black" "red" "green3"
xx<-markChanges(tree,lwd=3)
axis(1)
obj<-sapply(xx,function(x) lines(rep(x,2),par()$usr[3:4],
lty="dashed",col="grey"))
obj<-sapply(changes,function(x) points(x,rep(par()$usr[3]+0.2,
length(x))))
add.simmap.legend(colors=setNames(palette()[1:m],states),x=0.05,
y=0,vertical=FALSE,prompt=FALSE)
markChanges(tree,lwd=3)
Of course, we can duplicate this across a sample of trees from a posterior sample; however we need to be cognizant of the fact that we are sure to end up with different numbers of changes of each type in each different tree.
That's it.
Any thoughts on how to do this to a multiSimmap? I'm kinda stumped.
ReplyDeleteI just realized i might have been unclear in my question. The goal would be a single matrix like `changes` above, but of the timing of all the transitions in the multiSimmap object. Thanks!
Delete