Saturday, February 6, 2016

Coloring tip labels of a traitgram using phenogram

A phytools user requested instructions on how to paint the labels of a plotted traitgram (from phytools function phenogram) using different colors. Naively, this might seem straightforward now that phenogram assigns the environmental variable, "last_plot.phylo". Unfortunately, it is not that easy because the default is to use a numerical algorithm to optimize the spacing of the tip labels, and these new, spaced positions are not exported.

Consequently, I have now updated phenogram to invisibly return a matrix containing the x & y coordinates of each plotted tip label. Overlaying new tip labels is still not entirely trivial, because we want to overlay them precisely on our existing plotted labels (and we're going to do it with shadowtext in the TeachingDemos package to ensure that we don't see any part of the old label peeking out).

Here's a demo in which I color the tip labels by a discrete character:

library(phytools)
tree
## 
## Phylogenetic tree with 26 tips and 25 internal nodes.
## 
## Tip labels:
##  A, B, C, D, E, F, ...
## 
## Rooted; includes branch lengths.
x
##       A       B       C       D       E       F       G       H       I 
##  2.8600  2.1058  1.7397  2.8509 -1.8374  1.6606  5.6038  4.1693 -2.5189 
##       J       K       L       M       N       O       P       Q       R 
## -1.1314  4.9547  4.3946  5.8012  4.9661  6.0724  4.6994  3.5680  3.9083 
##       S       T       U       V       W       X       Y       Z 
##  1.8861  0.8216 -0.3356 -1.0667 -3.4524 -2.8598 -3.0773 -0.6128
y
## A B C D E F G H I J K L M N O P Q R S T U V W X Y Z 
## a b b b a b a a a b a a a a b c c c c b a b a b b b 
## Levels: a b c
cols<-setNames(c("blue","magenta","red"),
    sort(unique(y)))
label.colors<-setNames(cols[y],names(y))
obj<-phenogram(tree,x)
obj ## coordinates of the labels
##    x          y
## A 11  2.9583965
## B 11  2.2101406
## C 11  1.6551344
## D 11  2.6603615
## E 11 -1.8427531
## F 11  1.3776016
## G 11  5.5009397
## H 11  4.0677803
## I 11 -2.5065820
## J 11 -1.2872559
## K 11  4.9024095
## L 11  4.3454942
## M 11  5.7884745
## N 11  5.1807178
## O 11  6.0724000
## P 11  4.6233649
## Q 11  3.5061471
## R 11  3.7901399
## S 11  1.9331497
## T 11  0.7719063
## U 11 -0.3396495
## V 11 -0.9411745
## W 11 -3.4524000
## X 11 -2.8206221
## Y 11 -3.1061709
## Z 11 -0.6229529
library(TeachingDemos)
lastPP<-get("last_plot.phylo",envir=.PlotPhyloEnv) ## get offset
shadowtext(x=obj,labels=rownames(obj),
    col=label.colors[rownames(obj)],bg="white",pos=4,
    offset=lastPP$label.offset)
## overlay tip states to show they match
tiplabels(pie=to.matrix(y[tree$tip.label],seq=sort(unique(y))),
    piecol=cols,cex=0.4)

plot of chunk unnamed-chunk-1

This latest version of phytools can be installed as in my previous post today.

Update to phenogram to be used with nodelabels, tiplabels (plus a small bug fix)

I just added the feature that the traitgram plotting function in phytools, phenogram, will now also assign the environmental variable "last_plot.phylo" which will allow users to add things like tip, node, and edge labels to a plotted traitgram created using this function.

Details of this update can be viewed on the phytools GitHub page here, along with a fix for a bug in phenogram(...,spread.labels=TRUE) when the labels are non-overlapping to begin with.

Here's a demo in which I will overlay the observed & estimated ancestral values for a discrete character onto a plotted traitgram:

library(phytools)
packageVersion("phytools")
## [1] '0.5.16'
tree
## 
## Phylogenetic tree with 26 tips and 25 internal nodes.
## 
## Tip labels:
##  A, B, C, D, E, F, ...
## 
## Rooted; includes branch lengths.
x
##       A       B       C       D       E       F       G       H       I 
##  2.8600  2.1058  1.7397  2.8509 -1.8374  1.6606  5.6038  4.1693 -2.5189 
##       J       K       L       M       N       O       P       Q       R 
## -1.1314  4.9547  4.3946  5.8012  4.9661  6.0724  4.6994  3.5680  3.9083 
##       S       T       U       V       W       X       Y       Z 
##  1.8861  0.8216 -0.3356 -1.0667 -3.4524 -2.8598 -3.0773 -0.6128
y
## A B C D E F G H I J K L M N O P Q R S T U V W X Y Z 
## a b b b a b a a a b a a a a b c c c c b a b a b b b 
## Levels: a b c
phenogram(tree,x)
obj<-rerootingMethod(tree,y,model="SYM")
cols<-setNames(c("blue","magenta","red"),
    colnames(obj$marginal.anc))
nodelabels(pie=obj$marginal.anc,piecol=cols,cex=0.7)
tiplabels(pie=to.matrix(y,seq=colnames(obj$marginal.anc)),
    piecol=cols,cex=0.4)
add.simmap.legend(colors=cols,prompt=FALSE,x=0,y=6)

plot of chunk unnamed-chunk-1

This version of phytools can be installed as follows:

library(devtools)
install_github("liamrevell/phytools")

The data for this example were simulated using the following code:

tree<-pbtree(n=26,tip.label=LETTERS,scale=10)
x<-round(fastBM(tree),4)
Q<-matrix(c(-0.25,0.25,0,
    0.25,-0.5,0.25,
    0,0.25,-0.25),3,3)
rownames(Q)<-colnames(Q)<-letters[1:3]
y<-sim.history(tree,Q)$states
y<-setNames(as.factor(y),names(y))

Thursday, February 4, 2016

phylo.heatmap with missing values as NA

I just posted an update to phylo.heatmap that addresses a user request that the function permit missing values as NA. I did this first by just adding a bunch of na.rm=TRUE tags; however there remained a problem that in each instance that there was an NA in the data matrix to be plotted, the cell would be colored white. This is fine the color white is completely outside the range of plotted colors; however deceptively misleading if it is included, or at the margin, of this range!

My solution was to first identify the cells with NAs, and then draw a 'strikeout' line through each plotted cell, with white as the background. Here's my code to do so:

if(any(is.na(X))){
    ii<-which(is.na(X),arr.ind=TRUE)
    x.na<-seq(START,END,by=(END-START)/(ncol(X)-1))[ii[,2]]
    y.na<-seq(0,1,by=1/(nrow(X)-1))[ii[,1]]
    for(i in 1:length(x.na)){
        xx<-x.na[i]+c(1/2,-1/2)*(END-START)/(ncol(X)-1)
        yy<-y.na[i]+c(-1/2,1/2)*1/(nrow(X)-1)
        lines(xx,yy)
    }
}

I also added the option, in response to two different user comments (1, 2) to turn off the plotted points at the end of each tip of the tree (even though I like that feature).

Here's a demo:

library(phytools)
colors<-colorRampPalette(colors=c("white","black"))(100)
phylo.heatmap(tree,X,colors=colors,standardize=TRUE,lwd=3,
    pts=FALSE)

plot of chunk unnamed-chunk-2

This same argument, pts, can also be used in dotTree and in the S3 plotting method for objects of class "cophylo".

A version of phytools with these updates can be installed using devtools:

library(devtools)
install_github("liamrevell/phytools")

The data & tree were simulated as follows:

tree<-rtree(n=20)
X<-fastBM(tree,n=12)
X[sample(1:(ncol(X)*nrow(X)),3)]<-NA ## random missing data

Wednesday, February 3, 2016

plot.cophylo for very long tip labels

On the R-sig-phylo list a subscriber reported some weird behavior from the phytools plotting method for objects of class "cophylo". I don't have the original dataset, but it looked something like:

obj<-cophylo(tr1,tr2,assoc=assoc,plot=FALSE)
## Rotating nodes to optimize matching...
## Done.
plot(obj)

plot of chunk unnamed-chunk-1

I.e., the trees facing the wrong way!

This is actually easier to fix than it looks. It is really just a product of having very long labels. The function tries to fit everything, but the effect is that the edge lengths become negative. Oops.

We can either change our font size in the same size plotting device:

plot(obj,fsize=0.75)

plot of chunk unnamed-chunk-2

Or, perhaps better, we can change the size of our plotting device:

plot(obj)

plot of chunk unnamed-chunk-3

To export directly to PDF, we would do this using the height and width arguments in pdf:

pdf(file="cophylo.pdf",height=8.5,width=11)
plot(obj)
dev.off()
## windows 
##       2

BTW, the trees were generated here as follows:

library(phytools)
tips<-vector()
for(i in 1:26) tips[i]<-paste(sample(c(LETTERS,letters),25),collapse="")
tr1<-rtree(n=26,tip.label=tips)
for(i in 1:26) tips[i]<-paste(sample(c(LETTERS,letters),25),collapse="")
tr2<-rtree(n=26,tip.label=tips)
assoc<-cbind(sample(tr1$tip.label),sample(tr2$tip.label))