Saturday, August 27, 2016

Finding the association between two trees when some taxa are missing or when each host lineage matches with a clade of parasites

I've been playing testing the optimization method for tip matching in the cophylo method for co-phylogenetic plotting. Although a formal proof is lacking (& not forthcoming!), I have shown via simulation that when trees match precisely in their topology and labels, cophylo can successfully find this matching - even for large trees that are completely scrambled a priori.

This also seems to be true when some lineages are missing from one or the other tree - for instance when cospeciation is perfect, but some taxa have been lost due to extinction or missing taxa; and when cospeciation is perfect, but host lineages might possess multiple, divergent parasite races, themselves forming a monophyletic group - such as when 'speciation' for the parasite lineages occurs more rapidly than in the hosts.

Below, I demonstrate both of these cases.

First, with some missing lineages. Let's start by simulating:

library(phytools)
## simulate matching trees
t1<-t2<-rtree(n=100)
## rotate nodes randomly in one of the trees
for(i in 1:100) t2<-rotateNodes(t2,sample(1:t2$Nnode+Ntip(t2),1))
## drop lineages from each tree at random
t1<-drop.tip(t1,sample(t1$tip.label,40))
t2<-drop.tip(t2,sample(t2$tip.label,30))
## these trees are fully scrambled!
plot(cophylo(t1,t2,rotate=FALSE),fsize=0.7)

plot of chunk unnamed-chunk-1

## match them by rotating nodes
obj<-cophylo(t1,t2)
## Rotating nodes to optimize matching...
## Done.
obj
## Object of class "cophylo" containing:
## 
## (1) 2 (possibly rotated) phylogenetic trees in an object of class "multiPhylo".
## 
## (2) A table of associations between the tips of both trees.
plot(obj,fsize=0.7)

plot of chunk unnamed-chunk-1

Next, let's imagine a case in which cospeciation is perfect, but in which each host species matches with a clade of parasites (for instance). We can simulate this by first generating host & parasite trees that match, then slicing the host tree at a time point (say, when 26 lineages are extant). Next we compute all the descendants from that point in each subtree. These are our parasite clades.

## simulate trees
host<-parasite<-pbtree(n=100,scale=10)
## slice to get all subtrees
LTT<-ltt(host,plot=FALSE)
t<-LTT$times[which(LTT$ltt==26)]
subs<-treeSlice(parasite,t,trivial=TRUE)
hosts<-sapply(subs,function(x) x$tip.label[1])
host<-drop.tip(host,setdiff(host$tip.label,hosts))
host$tip.label<-LETTERS
for(i in 1:Ntip(host)){
    if(i==1) assoc<-cbind(rep(host$tip.label[i],Ntip(subs[[i]])),
            subs[[i]]$tip.label)
    else {
        assoc<-rbind(assoc,
            cbind(rep(host$tip.label[i],Ntip(subs[[i]])),
            subs[[i]]$tip.label))
    }
}
host
## 
## Phylogenetic tree with 26 tips and 25 internal nodes.
## 
## Tip labels:
##  A, B, C, D, E, F, ...
## 
## Rooted; includes branch lengths.
parasite
## 
## Phylogenetic tree with 100 tips and 99 internal nodes.
## 
## Tip labels:
##  t1, t4, t50, t51, t23, t91, ...
## 
## Rooted; includes branch lengths.
assoc
##        [,1] [,2]  
##   [1,] "A"  "t1"  
##   [2,] "B"  "t4"  
##   [3,] "C"  "t50" 
##   [4,] "C"  "t51" 
##   [5,] "D"  "t23" 
##   [6,] "D"  "t91" 
##   [7,] "D"  "t92" 
##   [8,] "E"  "t20" 
##   [9,] "E"  "t21" 
##  [10,] "F"  "t61" 
##  [11,] "F"  "t62" 
##  [12,] "F"  "t15" 
##  [13,] "G"  "t14" 
##  [14,] "G"  "t16" 
##  [15,] "G"  "t38" 
##  [16,] "G"  "t39" 
##  [17,] "G"  "t30" 
##  [18,] "H"  "t2"  
##  [19,] "I"  "t36" 
##  [20,] "I"  "t37" 
##  [21,] "I"  "t33" 
##  [22,] "I"  "t34" 
##  [23,] "J"  "t8"  
##  [24,] "J"  "t22" 
##  [25,] "J"  "t73" 
##  [26,] "J"  "t74" 
##  [27,] "J"  "t48" 
##  [28,] "K"  "t5"  
##  [29,] "L"  "t6"  
##  [30,] "M"  "t55" 
##  [31,] "M"  "t56" 
##  [32,] "N"  "t99" 
##  [33,] "N"  "t100"
##  [34,] "N"  "t64" 
##  [35,] "N"  "t52" 
##  [36,] "N"  "t17" 
##  [37,] "N"  "t40" 
##  [38,] "N"  "t41" 
##  [39,] "O"  "t19" 
##  [40,] "O"  "t68" 
##  [41,] "O"  "t69" 
##  [42,] "O"  "t18" 
##  [43,] "P"  "t25" 
##  [44,] "P"  "t43" 
##  [45,] "P"  "t44" 
##  [46,] "P"  "t70" 
##  [47,] "P"  "t71" 
##  [48,] "P"  "t63" 
##  [49,] "P"  "t11" 
##  [50,] "P"  "t10" 
##  [51,] "P"  "t9"  
##  [52,] "Q"  "t13" 
##  [53,] "Q"  "t46" 
##  [54,] "Q"  "t47" 
##  [55,] "Q"  "t42" 
##  [56,] "R"  "t28" 
##  [57,] "R"  "t29" 
##  [58,] "S"  "t49" 
##  [59,] "S"  "t89" 
##  [60,] "S"  "t90" 
##  [61,] "T"  "t26" 
##  [62,] "T"  "t27" 
##  [63,] "U"  "t45" 
##  [64,] "U"  "t79" 
##  [65,] "U"  "t80" 
##  [66,] "U"  "t97" 
##  [67,] "U"  "t98" 
##  [68,] "U"  "t12" 
##  [69,] "V"  "t7"  
##  [70,] "W"  "t57" 
##  [71,] "W"  "t58" 
##  [72,] "W"  "t59" 
##  [73,] "W"  "t81" 
##  [74,] "W"  "t82" 
##  [75,] "W"  "t60" 
##  [76,] "W"  "t32" 
##  [77,] "X"  "t24" 
##  [78,] "X"  "t75" 
##  [79,] "X"  "t76" 
##  [80,] "X"  "t77" 
##  [81,] "X"  "t78" 
##  [82,] "X"  "t65" 
##  [83,] "X"  "t66" 
##  [84,] "X"  "t67" 
##  [85,] "X"  "t95" 
##  [86,] "X"  "t96" 
##  [87,] "Y"  "t53" 
##  [88,] "Y"  "t54" 
##  [89,] "Y"  "t35" 
##  [90,] "Y"  "t93" 
##  [91,] "Y"  "t94" 
##  [92,] "Y"  "t85" 
##  [93,] "Y"  "t86" 
##  [94,] "Y"  "t72" 
##  [95,] "Y"  "t83" 
##  [96,] "Y"  "t84" 
##  [97,] "Y"  "t87" 
##  [98,] "Y"  "t88" 
##  [99,] "Y"  "t31" 
## [100,] "Z"  "t3"
## scramble order of parasites
for(i in 1:100) parasite<-rotateNodes(parasite,sample(1:parasite$Nnode+
    Ntip(parasite),1))
## they are fully scrambled
plot(cophylo(host,parasite,assoc=assoc,rotate=FALSE),
    fsize=c(1,0.5))

plot of chunk unnamed-chunk-2

## rotate to optimize matching
obj<-cophylo(host,parasite,assoc=assoc)
## Rotating nodes to optimize matching...
## Done.
plot(obj,fsize=c(1,0.5),link.type="curved")

plot of chunk unnamed-chunk-2

Pretty neat, right?

3 comments:

  1. Very useful indeed. Is there a way to display the association lines in a different color? I tried plot(obj, link.type="curved", col="red"), but it didn't work.

    ReplyDelete
    Replies
    1. I believe the argument is link.col. Sorry it's not better documented.

      Delete
  2. Jeg synes, at dette bør være den billigste nike replika jeg nogensinde har købt.Køb Nike Air Max sko online Jeg deler denne hjemmeside med mine venner.købeair nike max 1 De er meget glade, prisen er så billig, og de kan købe sådanne gode sko.

    ReplyDelete