As I mentioned
the
other day, during my sabbatical at the Universidad de los Andes here in
Bogotá I submitted a paper with some colleagues describing the function
ratebytree
for comparing the evolutionary rates of a continuous character between
trees.
The manuscript was favorably reviewed, but in their typically pesky* (*said with affection)
manner, reviewers asked us to add different models of evolution - the feature that I added
last
week. They also asked us to extend the approach to discrete character evolution.
This was something we pointed out could be done in the manuscript, but had not explicitly
addressed. It is theoretically straightforward - although not
completely
trivial - particularly if you want your new function updates to be accompanied by
legible print
methods & so on.
I have, however, now added this feature to the function ratebytree
. The function
can take an argument, type
(which should assume the value "continuous"
or "discrete"
); however, if not provided, the function will attempt to ascertain
the character type.
Here is a quick demo using some simple simulated data:
library(phytools)
packageVersion("phytools")
## [1] '0.6.25'
First, our trees & data:
t1
##
## Phylogenetic tree with 26 tips and 25 internal nodes.
##
## Tip labels:
## A, B, C, D, E, F, ...
##
## Rooted; includes branch lengths.
x1
## A B C D E F G H I J K L M N O P Q R S T U V W X Y Z
## c c c c a c b c b c a c b a b c b b a c a c c c c c
## Levels: a b c
t2
##
## Phylogenetic tree with 40 tips and 39 internal nodes.
##
## Tip labels:
## t7, t34, t35, t3, t29, t30, ...
##
## Rooted; includes branch lengths.
x2
## t7 t34 t35 t3 t29 t30 t31 t14 t15 t2 t39 t40 t9 t8 t21 t22 t32 t33
## b a a c b b b b c b a a c c a a c c
## t23 t24 t19 t20 t11 t5 t6 t25 t26 t4 t36 t37 t38 t12 t13 t27 t28 t16
## c c b b c c a c c a b b b c c c b c
## t1 t10 t17 t18
## a a a a
## Levels: a b c
## or
dotTree(t3,x3,ftype="off")
x<-list(x1,x2,x3)
x
## [[1]]
## A B C D E F G H I J K L M N O P Q R S T U V W X Y Z
## c c c c a c b c b c a c b a b c b b a c a c c c c c
## Levels: a b c
##
## [[2]]
## t7 t34 t35 t3 t29 t30 t31 t14 t15 t2 t39 t40 t9 t8 t21 t22 t32 t33
## b a a c b b b b c b a a c c a a c c
## t23 t24 t19 t20 t11 t5 t6 t25 t26 t4 t36 t37 t38 t12 t13 t27 t28 t16
## c c b b c c a c c a b b b c c c b c
## t1 t10 t17 t18
## a a a a
## Levels: a b c
##
## [[3]]
## t19 t20 t3 t11 t12 t36 t37 t21 t24 t25 t10 t13 t14 t2 t7 t8 t9 t16
## a a a b a c b a a a b a a a c b a c
## t28 t29 t32 t33 t17 t34 t35 t18 t1 t39 t40 t22 t23 t41 t42 t4 t5 t26
## b b b b b b b b b c c c c b b b a b
## t27 t6 t49 t50 t15 t43 t44 t45 t46 t30 t31 t47 t48 t38
## b b c c b c c b b c c b b b
## Levels: a b c
trees<-c(t1,t2,t3)
trees
## 3 phylogenetic trees
OK, now let's try it! First, I will assume a symmetric model of transitions between states:
fitSYM<-ratebytree(trees,x,type="discrete",model="SYM")
fitSYM
## ML common-rate model:
## a<->b a<->c b<->c k logL
## value 1.3136 0.0815 3.0745 3 -100.7769
##
## Model fitting method was "optim".
##
## ML multi-rate model:
## a<->b a<->c b<->c k logL
## tree1 3.5598 0 5.2209 9 -99.2075
## tree2 0.2588 0.5751 1.5436
## tree3 1.3966 0 3.2116
##
## Model fitting method was "nlminb".
##
## Likelihood ratio: 3.1388
## P-value (based on X^2): 0.7912
This tells us that we have little evidence to reject a constant (set of) rate(s) across our trees. We can nonetheless try simpler or more complex models and see if the outcome is similar or different. Remember, I said the function would attempt to ascertain if the character is discretely or continuously-valued, so let's try that:
fitER<-ratebytree(trees,x)
fitER
## ML common-rate model:
## q k logL
## value 1.1995 1 -104.2243
##
## Model fitting method was "optimize".
##
## ML multi-rate model:
## q k logL
## tree1 2.1052 3 -103.4967
## tree2 0.8053
## tree3 1.4482
##
## Model fitting method was "nlminb".
##
## Likelihood ratio: 1.4552
## P-value (based on X^2): 0.4831
fitARD<-ratebytree(trees,x,model="ARD")
fitARD
## ML common-rate model:
## b->a c->a a->b c->b a->c b->c k logL
## value -0.0483 1.1847 2.3293 1.5367 -0.7321 3.8122 6 -99.4653
##
## Model fitting method was "optim".
##
## ML multi-rate model:
## b->a c->a a->b c->b a->c b->c k logL
## tree1 4.0703 0 4.1987 7.5158 0 16.8153 18 -94.8226
## tree2 0.1143 0.5863 0 1.2591 0.1455 2.3226
## tree3 0.6492 0.2584 1.9163 4.7043 0 2.7484
##
## Model fitting method was "nlminb".
##
## Likelihood ratio: 9.2853
## P-value (based on X^2): 0.6784
By chance, these data were simulated as follows - and thus without a difference in rate or process between trees:
t1<-pbtree(n=26,tip.label=LETTERS,scale=1)
t2<-pbtree(n=40,scale=2)
t3<-pbtree(n=50,scale=1)
Q<-matrix(c(-2,1,1,1,-2,1,1,1,-2),3,3)
rownames(Q)<-colnames(Q)<-letters[1:3]
x1<-as.factor(sim.history(t1,Q)$states)
x2<-as.factor(sim.history(t2,Q)$states)
x3<-as.factor(sim.history(t3,Q)$states)
Why don't we simulate under conditions in which the process differs between trees and see if we obtain a different result:
t4<-pbtree(n=60,scale=1)
t5<-pbtree(n=70,scale=2)
t6<-pbtree(n=80,scale=1)
Q1<-matrix(c(-1,1,1,-1),2,2)
Q2<-matrix(c(-2,2,2,-2),2,2)
x4<-as.factor(sim.history(t4,Q1)$states)
x5<-as.factor(sim.history(t5,Q1)$states)
x6<-as.factor(sim.history(t6,Q2)$states)
Now we're ready to fit our models:
fitER<-ratebytree(c(t4,t5,t6),list(x4,x5,x6))
fitER
## ML common-rate model:
## q k logL
## value 1.2509 1 -116.4188
##
## Model fitting method was "optimize".
##
## ML multi-rate model:
## q k logL
## tree1 0.8768 3 -112.826
## tree2 0.7286
## tree3 2.7957
##
## Model fitting method was "nlminb".
##
## Likelihood ratio: 7.1856
## P-value (based on X^2): 0.0275
This suggests that a model in which the different trees are permitted to have different rates of evolution fits better than one in which they are all constrained to have equal rates - just as we simulated.
Finally, to get this functionality you should update phytools from GitHub as follows:
library(devtools)
install_github("liamrevell/phytools")
More later.
Holo.
ReplyDeleteNo llevo mucho trabajando con phytools, Actualmente estoy trabajando con un gran número de rasgos morfológicos y quisiera calcular las tasas evolutivas dentro de clados del mismo árbol pero he tenido muchas dificultades para lograrlo, te agradecerÃa mucho cualquier ayuda!!