Tuesday, January 3, 2023

Alternative algorithms for plotting phylogenies in R

Yesterday I posted an algorithm for plotting trees in which we drew every path from the root to any tip as a segmented line through each ancestral node.

This may not be the most computationally efficient way to plot a tree, but one possible advantage of this approach, I thought might be to avoid the overplotting at “joints” that occurs when each horizontal & vertical edge in the graphed tree is plotted as a separate line or segment. This overplotting only tends to become visible when we use semi-transparent colors – or if we use so-called “butt” line endings (lend=1).

library(phytools)
library(phangorn)
data(salamanders)
par(mfrow=c(1,2))
plotTree(salamanders,color=make.transparent("black",0.5),lwd=6,
	ftype="i",mar=c(0.1,0.1,2.1,0.1))
mtext("a) semi-transparent overplotting",line=0,adj=0.1)
plotTree(salamanders,color=palette()[4],lwd=6,ftype="i",
	lend=1,mar=c(0.1,0.1,2.1,0.1))
mtext("b) \"butt\" line caps",line=0,adj=0.1)

plot of chunk unnamed-chunk-1

Here’s a simple function based on my previous blog post.

lineTree<-function(...){
	args<-list(...)
	col<-if(!is.null(args$color)) args$color else par()$fg
	args$color<-"transparent"
	do.call(plotTree,args)
	lwd<-if(!is.null(args$lwd)) args$lwd else par()$lwd
	obj<-get("last_plot.phylo",envir=.PlotPhyloEnv)
	phy<-list(edge=obj$edge,Nnode=obj$Nnode,
		tip.label=1:obj$Ntip)
	attr(phy,"class")<-"phylo"
	for(i in 1:Ntip(phy)){
		aa<-c(i,Ancestors(phy,i))
		lines(obj$xx[aa],obj$yy[aa],col=col,type="s",
			lwd=lwd)
	}
}
par(lend=1,ljoin=2)
lineTree(salamanders,lwd=6,color=palette()[4],ftype="i")

plot of chunk unnamed-chunk-2

We can see straight away that this plots much more cleanly – and even allows us to use a ‘beveled’ line joint (ljoin=2).

Unfortunately, this doesn’t obviate our overplotting problem as we’re actually overplotting all the edges of our graphs, except for the tips.

To see that, let’s set our color to be semi-transparent as follows.

par(lend=1,ljoin=2)
lineTree(salamanders,lwd=6,
	color=make.transparent(color=palette()[4],0.2),
	ftype="i")

plot of chunk unnamed-chunk-3

As an alternative solution to the same problem, let’s just record the line segments that we’ve plotted, and then avoid overplotting them – graphing from an internal node to each tip.

To get an idea of how this might work, let’s plot each separately graphed line using a different color and line width. We should see that no/minimal overplotting occurs.

set.seed(7)
lineTree.2<-function(...){
	args<-list(...)
	args$color<-"transparent"
	do.call(plotTree,args)
	obj<-get("last_plot.phylo",envir=.PlotPhyloEnv)
	nm<-nrow(obj$edge)
	draw<-1:max(obj$edge)
	phy<-list(edge=obj$edge,Nnode=obj$Nnode,
		tip.label=1:obj$Ntip)
	attr(phy,"class")<-"phylo"
	for(i in 1:Ntip(phy)){
		aa<-c(i,intersect(Ancestors(phy,i),draw))
		capture.output(aa<-c(aa,getParent(phy,
			aa[length(aa)])))
		xx<-obj$xx[aa]
		yy<-obj$yy[aa]
		lines(xx,yy,lwd=sample(1:5,1),
			col=sample(palette(),1),
			type="s")
		draw<-setdiff(draw,aa)
		Sys.sleep(0.1)
	}
}
data(mammal.tree)
par(lend=1,ljoin=2)
lineTree.2(mammal.tree,fsize=0.7,ftype="i")

plot of chunk unnamed-chunk-5

This code chunk also includes a Sys.sleep lag, causing lines to appear one-by-one instead of all at once, which (if animated) would look something like this.

line phylogeny plot

In this animation we can clearly see how the tree is being build by adding path from the node to each tip, one by one.

Now here’s a version without the wonky colors & time delay.

lineTree.3<-function(...){
	args<-list(...)
	col<-if(!is.null(args$color)) args$color else par()$fg
	lwd<-if(!is.null(args$lwd)) args$lwd else par()$lwd
	args$color<-"transparent"
	do.call(plotTree,args)
	obj<-get("last_plot.phylo",envir=.PlotPhyloEnv)
	nm<-nrow(obj$edge)
	draw<-1:max(obj$edge)
	phy<-list(edge=obj$edge,Nnode=obj$Nnode,
		tip.label=1:obj$Ntip)
	attr(phy,"class")<-"phylo"
	for(i in 1:Ntip(phy)){
		aa<-c(i,intersect(Ancestors(phy,i),draw))
		capture.output(aa<-c(aa,getParent(phy,
			aa[length(aa)])))
		xx<-obj$xx[aa]
		yy<-obj$yy[aa]
		lines(xx,yy,type="s",col=col,lwd=lwd)
		draw<-setdiff(draw,aa)
	}
}
data(vertebrate.tree)
par(ljoin=2,lend=1)
lineTree.3(vertebrate.tree,ftype="i",fsize=1.1,lwd=8,
	color=make.transparent("black",0.5))

plot of chunk unnamed-chunk-7

I’m not sure what to say here – our overplotting problem is almost gone, but not completely gone, which is a bit disappointing.

Lastly, just for fun, here is a “spline” tree that uses cubic spline (from smooth.spline) to interpolate the path between graphed nodes (but otherwise a similar plotting technique to what we’ve seen above).

splineTree<-function(tree,...){
	h<-min(tree$edge.length)/4
	tree<-make.era.map(tree,seq(0,max(nodeHeights(tree)),by=h))
	tree<-map.to.singleton(tree)
	args<-list(...)
	args$tree<-tree
	args$color<-"transparent"
	do.call(plotTree,args)
	obj<-get("last_plot.phylo",envir=.PlotPhyloEnv)
	phy<-list(edge=obj$edge,Nnode=obj$Nnode,
		tip.label=1:obj$Ntip)
	attr(phy,"class")<-"phylo"
	for(i in 1:Ntip(phy)){
		aa<-c(i,Ancestors(phy,i))
		xx<-obj$xx[aa]
		yy<-obj$yy[aa]
		tmp<-predict(smooth.spline(xx,yy,
			df=min(max(round(0.1*length(xx)),4),
			length(xx))))
		tmp$x<-c(xx[length(xx)],tmp$x,xx[1])
		tmp$y<-c(yy[length(yy)],tmp$y,yy[1])
		lines(tmp)
	}
}
splineTree(mammal.tree,fsize=0.7,ftype="i")

plot of chunk unnamed-chunk-8

That’s it!

No comments:

Post a Comment

Note: due to the very large amount of spam, all comments are now automatically submitted for moderation.