Thursday, May 25, 2017

Add a 'shadow' visual effect to a plotted tree in R

Today I saw a cool visual effect in a plotted phytools object, although I suspect it was achieved by posthoc editing the plot in Adobe Illustrator rather than in R.

The effect was to add a 'shadow' to a plotted tree. Here is a quick demo of how to do it in R. I have illustrated the method using a plotted "contMap" object, because it is a little trickier, but we could do the same thing with any old plotted tree:

## load phytools
library(phytools)
## here is our tree & data
tree
## 
## Phylogenetic tree with 26 tips and 25 internal nodes.
## 
## Tip labels:
##  A, B, C, D, E, F, ...
## 
## Rooted; includes branch lengths.
x
##           A           B           C           D           E           F 
## -1.28663899 -2.18638418 -1.76955549 -1.85871833  0.92086981 -2.31821640 
##           G           H           I           J           K           L 
##  3.04198683  1.41705375  1.35059487 -0.03275257  0.19923947  0.09906219 
##           M           N           O           P           Q           R 
##  0.84486447  0.06387185  0.27210453  1.02166697  0.85217792 -0.31925792 
##           S           T           U           V           W           X 
## -1.17999809  2.67084163  3.05705763  2.77059737  7.84958715  5.18219507 
##           Y           Z 
##  3.78072083  4.69439548
## get our "contMap" object
obj<-contMap(tree,x,plot=FALSE,res=1000)
## now this is really just a test plot that we are going to
## compute to get our x & y limits
plot(obj,lwd=6)

plot of chunk unnamed-chunk-1

lastPP<-get("last_plot.phylo",envir=.PlotPhyloEnv)
## now our real plot:
par(fg="transparent")
plotTree(obj$tree,color="darkgrey",lwd=8,xlim=lastPP$x.lim,ylim=lastPP$y.lim)
xlim<-lastPP$x.lim-0.004*diff(lastPP$x.lim)
ylim<-lastPP$y.lim-0.004*diff(lastPP$y.lim)
## this is not for the shadow, but to get an outline on our
## plotted contMap tree:
plotTree(obj$tree,lwd=8,xlim=xlim,ylim=ylim,add=TRUE)
par(fg="black")
plot(obj$tree,colors=obj$cols,lwd=6,add=TRUE,xlim=xlim,ylim=ylim)
## finally, add our legend, also with a shadow
lines(c(0,0.5*max(nodeHeights(obj$tree)))-0.004*diff(lastPP$x.lim),
    c(-1,-1)-0.004*diff(lastPP$y.lim),lwd=8,lend=2,col="darkgrey")
add.color.bar(leg=0.5*max(nodeHeights(obj$tree)),cols=obj$cols,
    title="trait value",prompt=FALSE,x=0,y=-1,lims=obj$lims,lwd=6)

plot of chunk unnamed-chunk-1

Pretty neat - I think.

The tree & data for this example were simulated as follows:

tree<-pbtree(n=26,tip.label=LETTERS)
x<-fastBM(tree)

Tuesday, May 23, 2017

Type I error rates in ratebytree method to compare evolutionary rates between trees

Today I'm working on a more rigorous exploration of the statistical properties of the phytools function ratebytree.

As I have described in prior posts (e.g., 1, 2, , 4, 5), this is a method for comparing the evolutionary rate of a continuous character between trees. This method is a variant of Brian O'Meara et al.'s (2006) censored rates test. In fact, I don't really consider this to be a separate method at all - merely the application of the O'Meara et al. method to a particular problem.

In the following example, I simulate trees & data under a variety of different scenarios of tree size, but consistently under the null hypothesis of no difference in rate between trees. Then I fit the model to each set of simulated trees & datasets, extract the P-values, and compute the fraction of times that the hypothesis test resulted in a value of P that was less than or equal to 0.05.

For R enthusiasts, the script below also includes some useful examples of apply family functions, expression to put sub/superscripts in figure legends, and mtext to artfully position our subplot labels.

library(phytools)

set.seed(1)

nrep<-1000
t10<-replicate(nrep,pbtree(n=10,nsim=2),simplify=FALSE)
x10<-lapply(t10,function(x) lapply(x,fastBM))
t50<-replicate(nrep,pbtree(n=50,nsim=2),simplify=FALSE)
x50<-lapply(t50,function(x) lapply(x,fastBM))
foo<-function(n1,n2) c(pbtree(n=n1),pbtree(n=n2))
t10.t50<-replicate(nrep,foo(10,50),simplify=FALSE)
x10.x50<-lapply(t10.t50,function(x) lapply(x,fastBM))
t30<-replicate(nrep,pbtree(n=30,nsim=3),simplify=FALSE)
x30<-lapply(t30,function(x) lapply(x,fastBM))

typeIerror<-function(i,t,x,max){
    obj<-ratebytree(t,x)
    if(i==1) cat("|.") else cat(".")
    if(i==max) cat("|\n")
    if(i%%50==0) cat("\n ")
    flush.console()
    obj$P.chisq
}

par(mfrow=c(2,2))

P10<-mapply(typeIerror,1:nrep,t10,x10,MoreArgs=list(max=nrep))
## |..................................................
##  ..................................................
##  ..................................................
##  ..................................................
##  ..................................................
##  ..................................................
##  ..................................................
##  ..................................................
##  ..................................................
##  ..................................................
##  ..................................................
##  ..................................................
##  ..................................................
##  ..................................................
##  ..................................................
##  ..................................................
##  ..................................................
##  ..................................................
##  ..................................................
##  ..................................................|
## 
## 
h10<-hist(P10,breaks=seq(0,1,by=0.05),col="grey",xlab="P-value",
    main="",freq=FALSE)
mtext(text=expression('a) N'[1]*'=N'[2]*'=10'),adj=0,line=1,
    cex=1)
abline(h=1,lwd=2,col="red",lty="dashed")

P50<-mapply(typeIerror,1:nrep,t50,x50,MoreArgs=list(max=nrep))
## |..................................................
##  ..................................................
##  ..................................................
##  ..................................................
##  ..................................................
##  ..................................................
##  ..................................................
##  ..................................................
##  ..................................................
##  ..................................................
##  ..................................................
##  ..................................................
##  ..................................................
##  ..................................................
##  ..................................................
##  ..................................................
##  ..................................................
##  ..................................................
##  ..................................................
##  ..................................................|
## 
## 
h50<-hist(P50,breaks=seq(0,1,by=0.05),col="grey",main="",
    xlab="P-value",freq=FALSE)
mtext(text=expression('b) N'[1]*'=N'[2]*'=50'),adj=0,line=1,
    cex=1)
abline(h=1,lwd=2,col="red",lty="dashed")

P10.50<-mapply(typeIerror,1:nrep,t10.t50,x10.x50,MoreArgs=list(max=nrep))
## |..................................................
##  ..................................................
##  ..................................................
##  ..................................................
##  ..................................................
##  ..................................................
##  ..................................................
##  ..................................................
##  ..................................................
##  ..................................................
##  ..................................................
##  ..................................................
##  ..................................................
##  ..................................................
##  ..................................................
##  ..................................................
##  ..................................................
##  ..................................................
##  ..................................................
##  ..................................................|
## 
## 
h10.50<-hist(P10.50,breaks=seq(0,1,by=0.05),col="grey",main="",freq=FALSE,
    xlab="P-value")
mtext(text=expression('c) N'[1]*'=10, N'[2]*'=50'),adj=0,line=1,
    cex=1)
abline(h=1,lwd=2,col="red",lty="dashed")

P30<-mapply(typeIerror,1:nrep,t30,x30,MoreArgs=list(max=nrep))
## |..................................................
##  ..................................................
##  ..................................................
##  ..................................................
##  ..................................................
##  ..................................................
##  ..................................................
##  ..................................................
##  ..................................................
##  ..................................................
##  ..................................................
##  ..................................................
##  ..................................................
##  ..................................................
##  ..................................................
##  ..................................................
##  ..................................................
##  ..................................................
##  ..................................................
##  ..................................................|
## 
## 
h30<-hist(P30,breaks=seq(0,1,by=0.05),col="grey",freq=FALSE,main="",
    xlab="P-value")
mtext(text=expression('d) N'[1]*'=N'[2]*'=N'[3]*'=30'),
    adj=0,line=1,cex=1)
abline(h=1,lwd=2,col="red",lty="dashed")

plot of chunk unnamed-chunk-1

typeI<-matrix(sapply(list(P10,P50,P10.50,P30),function(x,alpha)
    mean(x<=alpha),alpha=0.05),4,1,dimnames=list(
    c("N1=N2=10","N1=N2=50","N1=10,N2=50","N1=N2=N3=30"),
    "type I error"))
typeI
##             type I error
## N1=N2=10           0.065
## N1=N2=50           0.052
## N1=10,N2=50        0.056
## N1=N2=N3=30        0.063

If the method is working properly, under the null we should see a relatively uniform density on the interval [0,1]. We should also see a type I error rate close to the nominal level. In this case, our type I error rate is slightly elevated for the smallest tree sizes - but for even our relatively modestly sized trees of panels b) & c) there is relatively little evidence for elevated type I error over the nominal level in this test.

That's it.

Monday, May 22, 2017

arc.cladelabels using the names of descendant tips rather than the node numbers

Today a phytools blog reader commented:

“Liam, these clade labels are fantastic! But, I wonder if there is a possible way to determine clades by the tips (instead of by the nodes), since some clades could have just one branch.”

The answer is 'kind of.' That is, we can't give the function an argument that consists of the tip labels of our clade; however, we could specify the node using the descendant tips using a call to the function findMRCA.

For example:

library(phytools)
tree
## 
## Phylogenetic tree with 64 tips and 63 internal nodes.
## 
## Tip labels:
##  t63, t64, t47, t48, t43, t44, ...
## 
## Rooted; includes branch lengths.
plotTree(tree,type="fan",fsize=0.8,xlim=c(-4.1,4.1))

plot of chunk unnamed-chunk-1

If we want to label the clade containing t63, t64, t47, t48, t43, and t44 as clade A, then we can do it as follows:

plotTree(tree,type="fan",fsize=0.8,xlim=c(-4.1,4.1))
arc.cladelabels(text="clade A",node=findMRCA(tree,
    c("t63","t64","t47","t48","t43","t44")),ln.offset=1.1,
    lab.offset=1.16)

plot of chunk unnamed-chunk-2

We need not list all the taxa in the clade we want to label. For instance, if we want to label the clade containing taxa t46, t45, through t19, we only need list a set of tips for whom the MRCA is also the ancestor of the clade. For example:

plotTree(tree,type="fan",fsize=0.8,xlim=c(-4.1,4.1))
arc.cladelabels(text="clade B",node=findMRCA(tree,c("t46","t19")),
    ln.offset=1.1,lab.offset=1.16)
## and our previous label:
arc.cladelabels(text="clade A",node=findMRCA(tree,
    c("t63","t64","t47","t48","t43","t44")),ln.offset=1.1,
    lab.offset=1.16)

plot of chunk unnamed-chunk-3

Finally, if we want to label just one tip - well, we can just specify the node number of that tip. For instance to label the clade of tip t8 we can do:

plotTree(tree,type="fan",fsize=0.8,xlim=c(-4.1,4.1))
arc.cladelabels(text="clade C",node=which(tree$tip.label=="t8"),
    orientation="horizontal",ln.offset=1.1,lab.offset=1.12)
## with our other labels:
arc.cladelabels(text="clade A",node=findMRCA(tree,
    c("t63","t64","t47","t48","t43","t44")),ln.offset=1.1,
    lab.offset=1.16)
arc.cladelabels(text="clade B",node=findMRCA(tree,c("t46","t19")),
    ln.offset=1.1,lab.offset=1.16)

plot of chunk unnamed-chunk-4

Finally, those little red dots are just designed to keep track of the clade we are labeling. We can turn that off of course:

plotTree(tree,type="fan",fsize=0.8,xlim=c(-4.1,4.1),ftype="i")
arc.cladelabels(text="clade A",node=findMRCA(tree,
    c("t63","t64","t47","t48","t43","t44")),ln.offset=1.1,
    lab.offset=1.16,mark.node=FALSE)
arc.cladelabels(text="clade B",node=findMRCA(tree,c("t46","t19")),
    ln.offset=1.1,lab.offset=1.16,mark.node=FALSE)
arc.cladelabels(text="clade C",node=which(tree$tip.label=="t8"),
    orientation="horizontal",ln.offset=1.1,lab.offset=1.12,
    mark.node=FALSE)

plot of chunk unnamed-chunk-5

That's it.