Tuesday, February 26, 2013

On the shape of trees with random taxa addition or subtraction

Since I have a new function for add tips at random to a tree; and it is even easier to write a function to drop tips randomly from the tree - i.e., here it is:
drop.random<-function(tree,n=1)
tree<-drop.tip(tree,tip=sample(tree\$tip.label)[1:n])
I thought it might be fun to look at the effects of adding or subtracting tips at random from the tree. We already know that random missing taxa tends to create trees with longer than expected terminal edges - seemingly a slow-down in the rate of lineage diversification through time.

This is not very scientific, but first let's look at the LTT and Pybus & Harvey's (2000) γ for a single tree that we initiate with 100 species and then add to randomly using add.random in increments of 10. Here's our code (minus creating the video):
require(phytools)
tree<-pbtree(n=100,scale=1)
mas<-900
for(i in 1:(mas/10+1)){
x<-ltt(tree,plot=FALSE)
plot(x\$times[2:length(x\$times)],x\$ltt[2:length(x\$ltt)], xlab="time",ylab="lineages",log="y",type="l",xlim=c(0,1), ylim=c(2,max(x\$ltt)))
lines(c(0,1),c(2,max(x\$ltt)),lty=2)
text(x=0,y=max(x\$ltt),paste("N = ",length(tree\$tip), "\n","gamma = ",round(x\$gamma,3),sep=""),adj=c(0,1))
}

And here's the result:

OK, next, let's do the opposite - start with 1000 taxa and drop taxa random. First the code:
drop.random<-function(tree,n=1)    tree<-drop.tip(tree,tip=sample(tree\$tip.label)[1:n])
require(phytools)
tree<-pbtree(n=1000,scale=1)
menos<-900
for(i in 1:(menos/10+1)){
x<-ltt(tree,plot=FALSE)
plot(x\$times[2:length(x\$times)],x\$ltt[2:length(x\$ltt)],xlab="time", ylab="lineages",log="y",type="l",xlim=c(0,1), ylim=c(2,max(x\$ltt)))
lines(c(0,1),c(2,max(x\$ltt)),lty=2)
text(x=0,y=max(x\$ltt),paste("N = ",length(tree\$tip), "\n","gamma = ",round(x\$gamma,3),sep=""),adj=c(0,1))
tree<-drop.random(tree,n=10)
}

And the video:

Adding taxa at random, at least by our algorithm, does not seem to affect tree shape all that much; but subtracting random tips, as we expected, makes γ turn progressively more and more negative.

To see if this is idiosyncratic to the specific trees we started with, why don't we replicate the entire process (i.e., 900% addition or subtraction) in a single step for, say, 30 random pure-birth trees. Here's what that looks like. First, adding tips randomly:
XX<-matrix(NA,31,2,dimnames=list(NULL,c("gamma100","gamma1000")))
require(phytools)
for(i in 1:31){
tree<-pbtree(n=100)
XX[i,1]<-ltt(tree,plot=F)\$gamma
XX[i,2]<-ltt(tree,plot=F)\$gamma
}
colMeans(XX)
ZZ<-hist(XX[,2]-XX[,1],plot=F,breaks=-9.5:0.5)
barplot(ZZ\$density,names=ZZ\$mids,space=0,main=expression(gamma[100]-gamma[1000]))

And the results:
> colMeans(XX)
gamma100  gamma1000
0.3568899 -1.9163830
And if we do the same thing dropping tips, here are the results:
> colMeans(YY)
gamma100  gamma1000
-6.1072618 0.2275565
So, on the face of it, it seems as though adding taxa randomly (from 100 to 1000 species in the tree) or dropping taxa (from 1000 to 100) both result in a decrease in γ - however the scale of decrease is highly asymmetrical, with random subtraction resulting in a much greater decrease in γ.

At the start of this little experiment I wasn't sure if random addition of taxa would increase or decrease γ, and in the end it seems to decrease γ but sometimes only a little.