## Wednesday, June 17, 2015

### Calculating the times spent with each of number of lineages on the tree

Yesterday a phytools user emailed me the following inquiry:

“Is there an easy way in phytools to infer the times while there are exactly k ancestral lineages in the tree? I would need to calculate these based on a newick tree input. However, I havent found a function in phytools for it. Somthing like k=2 from x to y k=3 from z to v”

This can be done pretty easily using the phytools function `ltt`. Of course, one might allow for the possilibity that - if lineages both increase & decrease through time, there could be more than one period of time during which there is `2` lineages, `3` lineages etc.

The following gives a quick demo - first using a tree in which the number of lineages both increases & descreases in time since the root:

``````library(phytools)
tree<-rtree(26,tip.label=LETTERS)
plotTree(tree)
``````

``````## create an object with the number of lineages through time:
obj<-ltt(tree,log=FALSE)
``````

Now let's go through all the number of possible lineages and tabulate all the time intervals in which the tree has each number of lineages:

``````foo<-function(n,obj){
ii<-which(obj\$ltt==n)
ii<-ii[ii!=length(obj\$ltt)]
sapply(ii,function(i,x) c(x[i],x[i+1]),x=obj\$times)
}
Nlineages<-sapply(1:max(obj\$ltt),foo,obj=obj)
if(is.matrix(Nlineages)){
colnames(Nlineages)<-1:max(obj\$ltt)
rownames(Nlineages)<-c("start","end")
} else if(is.list(Nlineages)){
names(Nlineages)<-1:max(obj\$ltt)
for(i in 1:length(Nlineages))
rownames(Nlineages[[i]])<-c("start","end")
}
``````

Our object is a list, with a matrix for each number of lineages showing the time intervals spent with that number of lineages.

``````Nlineages
``````
``````## \$`1`
##               16
## start 0 4.128545
## end   0 4.198241
##
## \$`2`
##               27       15
## start 0.00000000 4.049935
## end   0.07445996 4.128545
##
## \$`3`
##               28       13       11
## start 0.07445996 1.501722 3.962453
## end   0.15032225 1.711749 4.049935
##
## \$`4`
##              30         1         4        2       41       22
## start 0.1503222 0.3991084 0.5425447 1.499231 1.711749 3.655242
## end   0.2915629 0.4658409 0.8433361 1.501722 2.151097 3.962453
##
## \$`5`
##              31        50        29       24       35       21
## start 0.2915629 0.4658409 0.8433361 1.464400 2.151097 3.651133
## end   0.3991084 0.5425447 0.8714423 1.499231 2.194213 3.655242
##
## \$`6`
##              51        3       37        5        9       18
## start 0.8714423 1.458014 2.194213 2.331990 3.612215 3.642776
## end   0.9738692 1.464400 2.274517 2.353893 3.618789 3.651133
##
## \$`7`
##              32       26       25       45       42       14       12
## start 0.9738692 1.096267 1.325357 2.274517 2.353893 2.607869 3.509315
## end   1.0088520 1.263415 1.458014 2.331990 2.573532 2.785814 3.612215
##             44
## start 3.618789
## end   3.642776
##
## \$`8`
##             33       34       46       38        7
## start 1.008852 1.263415 2.573532 2.785814 3.484318
## end   1.096267 1.325357 2.607869 2.810022 3.509315
##
## \$`9`
##             47       20        6
## start 2.810022 3.318845 3.382049
## end   2.953015 3.380557 3.484318
##
## \$`10`
##             39       23       19        8       40
## start 2.953015 3.103372 3.199463 3.277445 3.380557
## end   2.988247 3.114297 3.252606 3.318845 3.382049
##
## \$`11`
##             43       36       17       49
## start 2.988247 3.114297 3.179681 3.252606
## end   3.103372 3.149976 3.199463 3.277445
##
## \$`12`
##             48
## start 3.149976
## end   3.179681
``````

Next, we can try and ultrametric trees with all lineages terminating in the present:

``````tree<-pbtree(n=26,tip.label=LETTERS)
plotTree(tree)
``````

``````obj<-ltt(tree,log=FALSE)
``````

``````foo<-function(n,obj){
ii<-which(obj\$ltt==n)
ii<-ii[ii!=length(obj\$ltt)]
sapply(ii,function(i,x) c(x[i],x[i+1]),x=obj\$times)
}
Nlineages<-sapply(1:max(obj\$ltt),foo,obj=obj)
if(is.matrix(Nlineages)){
colnames(Nlineages)<-1:max(obj\$ltt)
rownames(Nlineages)<-c("start","end")
} else if(is.list(Nlineages)){
names(Nlineages)<-1:max(obj\$ltt)
for(i in 1:length(Nlineages))
rownames(Nlineages[[i]])<-c("start","end")
}
``````

This time we have a matrix in which each column is a number of lineages and the rows show the start and end times. When no lineages are lost through time, there will be only one time interval for each number of lineages:

``````Nlineages
``````
``````##           1            2         3         4        5        6        7
## start 0e+00 2.000000e-11 0.4418836 0.5968292 1.356948 1.752875 1.756543
## end   2e-11 4.418836e-01 0.5968292 1.3569481 1.752875 1.756543 1.876857
##              8        9       10       11       12       13       14
## start 1.876857 1.931771 2.000830 2.022528 2.046486 2.218467 2.283606
## end   1.931771 2.000830 2.022528 2.046486 2.218467 2.283606 2.445454
##             15       16      17       18       19       20       21
## start 2.445454 2.506024 2.57934 2.586070 2.596797 2.668859 2.671522
## end   2.506024 2.579340 2.58607 2.596797 2.668859 2.671522 2.735799
##             22       23       24       25       26
## start 2.735799 2.773744 2.779681 2.823757 2.865986
## end   2.773744 2.779681 2.823757 2.865986 2.866205
``````

Note that the tiny zeroeth timeslice with one lineage at the start of each tree is an idiosyncratic feature of `phytools::ltt` and could easily be ignored.

That's it.