In response to a user request, I just
pushed
an update to anc.ML
to permit the user to fit a so-called "EB"
model of evolution.
The EB model is described as follows in the documentation of geiger's fitContinuous
function:
“EB is the Early-burst model (Harmon et al. 2010) and also called the ACDC model (accelerating-decelerating; Blomberg et al. 2003). Set by the a rate parameter, EB fits a model where the rate of evolution increases or decreases exponentially through time, under the model r[t] = r[0] * exp(a * t), where r[0] is the initial rate, a is the rate change parameter, and t is time.”
My very knowledgeable colleagues Luke Harmon and
Josef Uyeda both assured me that a an EB transformation of the tree
would give me the correct variance-covariance matrix for both the internal nodes and the tips of the
tree. (This is not true, evidently, for the OU model.) Consequently, my initial plan was just
simultaneously optimize σ2, r, and the N - 1 ancestral nodes (for a bifurcating
tree with N tips) by brute force using one of the built-in optimizers in R such as optim
.
This works - but for even a modest-sized tree will result in a very high-dimensional optimization problem & thus quite long computation time.
Just as I was going to bed the other night, however, it occurred to me that this could be done by numerically
optimizing only σ2 and r, and then running the re-rooting algorithm (as
implemented in the phytools function fastAnc
) on a tree transformed by r for that
iteration of the optimization. Note that it would not work to just compute the ML values of
σ2 and r using an external function, such as fitContinuous
, and then
to transform the tree by the optimized value of r because simultaneous optimization of σ2
and r with the ancestral states results in different MLEs of those parameters! (We also see this for
the BM and OU models, although in the former case it doesn't matter.)
Here's a quick demo using some data simulated under a scenario of decreasing rate of evolution through time.
For this example, the tip data are in the vector x
, while the normally unobserved internal node
values are contained in the vector a
.
library(phytools)
packageVersion("phytools")
## [1] '0.6.21'
tree
##
## Phylogenetic tree with 100 tips and 99 internal nodes.
##
## Tip labels:
## t29, t43, t44, t68, t75, t86, ...
##
## Rooted; includes branch lengths.
x
## t29 t43 t44 t68 t75
## 0.438496287 0.472141210 0.508929413 0.476404420 0.500218520
## t86 t87 t56 t57 t21
## 0.442388038 0.412293060 0.325040293 0.375942650 0.420729607
## t32 t33 t71 t72 t18
## 0.144507151 0.074422614 0.188952192 0.203205398 0.156567264
## t9 t30 t48 t49 t37
## 0.228356432 0.201488956 0.150009612 0.084055077 0.151670548
## t23 t24 t7 t5 t88
## 0.120957147 0.276016226 0.289390081 0.286052945 0.285526927
## t89 t1 t58 t59 t84
## 0.269553017 0.338785228 -0.638255514 -0.631750846 -0.655941964
## t85 t16 t65 t66 t95
## -0.640326625 -0.817037017 -0.685259524 -0.663680995 -0.659381902
## t96 t90 t79 t22 t17
## -0.676857787 -0.669947136 -0.673772868 -0.617029153 -0.709658998
## t82 t83 t8 t10 t19
## -0.696291418 -0.688863550 -0.978188152 -0.876536606 -0.744679735
## t20 t4 t50 t51 t67
## -0.782603431 -0.766160613 -0.779978124 -0.763065919 -0.877586463
## t93 t94 t91 t92 t27
## -0.839376372 -0.853901636 -0.636389151 -0.643581270 -0.652135509
## t28 t6 t42 t80 t81
## -0.657921272 -0.964749242 -0.994739494 -0.971873925 -0.970999109
## t41 t52 t53 t62 t78
## -1.047142667 -1.030997961 -1.042163333 -1.037583807 -1.022104291
## t99 t100 t34 t35 t45
## -1.025954239 -1.012997986 -1.047053377 -0.987320158 -1.086046548
## t46 t36 t38 t39 t54
## -1.156241374 -1.120119673 -1.075409445 -1.036747839 -0.663847989
## t55 t97 t98 t47 t31
## -0.698365634 -0.405954828 -0.414290095 -0.383287275 -0.344969991
## t11 t13 t76 t77 t73
## -0.370850161 -1.095050380 -1.037155643 -1.045283871 -0.952923639
## t74 t2 t3 t14 t63
## -1.001848791 0.141608367 0.061594482 0.313000952 0.091721162
## t64 t69 t70 t40 t60
## 0.102012849 0.068840893 0.072838839 0.060826002 0.001699522
## t61 t26 t25 t15 t12
## 0.065996008 0.018760874 0.174547937 0.060346979 0.097391382
a
## 101 102 103 104 105 106
## 0.00000000 0.56474286 0.56260547 0.49095249 0.51083929 0.48733558
## 107 108 109 110 111 112
## 0.48793788 0.45580093 0.33891569 0.34342841 0.20012754 0.30850154
## 113 114 115 116 117 118
## 0.31122528 0.08825530 0.31467638 0.22930927 0.18991152 0.26126403
## 119 120 121 122 123 124
## 0.21824665 0.21867299 0.18964662 0.12701860 0.28838737 0.29755425
## 125 126 127 128 129 130
## 0.32514997 0.24951858 0.28827983 -0.03014248 -0.51204649 -0.79961205
## 131 132 133 134 135 136
## -0.64849732 -0.64521521 -0.65938876 -0.67442981 -0.63613255 -0.71069622
## 137 138 139 140 141 142
## -0.67798100 -0.69087899 -0.66277186 -0.66267207 -0.66167668 -0.90598910
## 143 144 145 146 147 148
## -0.90334119 -0.81676385 -0.82997390 -0.85796450 -0.81789610 -0.72827860
## 149 150 151 152 153 154
## -0.65730144 -0.81978508 -0.81675215 -0.80007763 -0.79561166 -0.84058387
## 155 156 157 158 159 160
## -0.82499333 -0.88804442 -0.85468833 -0.76811047 -0.63836170 -0.63387183
## 161 162 163 164 165 166
## -0.86795308 -0.99065017 -0.97028343 -0.90414426 -1.09017652 -1.06623090
## 167 168 169 170 171 172
## -1.05896253 -1.06176918 -1.04887686 -1.01026847 -0.99830320 -1.01583314
## 173 174 175 176 177 178
## -1.04745049 -1.00458726 -1.10059441 -1.06715470 -1.07423575 -0.65413549
## 179 180 181 182 183 184
## -0.33893314 -0.31792921 -0.39497223 -0.41571693 -0.99822702 -0.97216554
## 185 186 187 188 189 190
## -1.02811719 -0.96863861 0.01942461 0.14562664 0.07154062 0.14167278
## 191 192 193 194 195 196
## 0.06133440 0.06645392 0.09498134 0.10166254 0.07414169 0.11281462
## 197 198 199
## 0.08205368 0.07092746 0.04485251
phenogram(tree,c(x,a),ftype="off",col="lightblue")
title(main="Simulated evolution under the EB model")
fitEB<-anc.ML(tree,x,model="EB")
fitEB
## Ancestral character estimates using anc.ML under a(n) EB model:
## 101 102 103 104 105 106 107
## -0.149645 0.303437 0.456207 0.462999 0.483923 0.463864 0.461351
## 108 109 110 111 112 113 114
## 0.441226 0.269318 0.254532 0.256832 0.341497 0.349491 0.134272
## 115 116 117 118 119 120 121
## 0.243730 0.188746 0.195718 0.237630 0.205890 0.170694 0.146584
## 122 123 124 125 126 127 128
## 0.128694 0.246328 0.247085 0.241900 0.216083 0.277122 -0.169551
## 129 130 131 132 133 134 135
## -0.682232 -0.681542 -0.688835 -0.691968 -0.647491 -0.639588 -0.648064
## 136 137 138 139 140 141 142
## -0.671030 -0.674188 -0.669582 -0.670976 -0.669138 -0.668707 -0.680070
## 143 144 145 146 147 148 149
## -0.786477 -0.812172 -0.799949 -0.794589 -0.800530 -0.764398 -0.694839
## 150 151 152 153 154 155 156
## -0.807828 -0.807780 -0.792406 -0.773237 -0.812107 -0.785678 -0.851186
## 157 158 159 160 161 162 163
## -0.847010 -0.688502 -0.640400 -0.664724 -0.917186 -0.977880 -0.972587
## 164 165 166 167 168 169 170
## -0.811511 -1.049309 -1.042513 -1.039931 -1.054948 -1.054235 -1.032098
## 171 172 173 174 175 176 177
## -1.025175 -1.019691 -1.054352 -1.043612 -1.100283 -1.067161 -1.064348
## 178 179 180 181 182 183 184
## -0.683323 -0.392338 -0.378467 -0.391821 -0.409841 -1.036606 -1.013423
## 185 186 187 188 189 190 191
## -1.037257 -0.983257 0.044576 0.095546 0.121583 0.166058 0.101472
## 192 193 194 195 196 197 198
## 0.106063 0.072185 0.094356 0.088166 0.084205 0.070487 0.052460
## 199
## 0.039961
##
## Fitted model parameters & likelihood:
## sigma^2 r logLik
## 0.449798 -4.169326 423.7552
##
## R thinks it has found the ML solution.
plot(a,fitEB$ace,pch=21,bg="grey",cex=1.5,xlab="true values",
ylab="fitted values (EB)")
lines(range(x),range(x),lwd=2,lty="dotted")
abline(h=0,lwd=1,lty="dotted",col="grey")
abline(v=0,lwd=1,lty="dotted",col="grey")
We can compare these fitted values to those obtained assuming a BM model of evolution, as is most commonly done:
fitBM<-fastAnc(tree,x,CI=TRUE)
fitBM
## Ancestral character estimates using fastAnc:
## 101 102 103 104 105 106 107
## -0.193765 0.186036 0.416272 0.442522 0.475950 0.460722 0.459450
## 108 109 110 111 112 113 114
## 0.440956 0.220870 0.222760 0.244864 0.318102 0.345322 0.146624
## 115 116 117 118 119 120 121
## 0.221311 0.192335 0.195784 0.226760 0.206323 0.174639 0.151006
## 122 123 124 125 126 127 128
## 0.132202 0.234623 0.235591 0.235146 0.218238 0.276331 -0.201955
## 129 130 131 132 133 134 135
## -0.519252 -0.601429 -0.677233 -0.679906 -0.648315 -0.640457 -0.648158
## 136 137 138 139 140 141 142
## -0.668130 -0.673712 -0.667233 -0.670759 -0.669119 -0.668701 -0.624943
## 143 144 145 146 147 148 149
## -0.722262 -0.779573 -0.776116 -0.781598 -0.793646 -0.770032 -0.696552
## 150 151 152 153 154 155 156
## -0.798796 -0.798906 -0.790307 -0.763582 -0.808358 -0.786171 -0.848426
## 157 158 159 160 161 162 163
## -0.846797 -0.704969 -0.641004 -0.675769 -0.867234 -0.968017 -0.970744
## 164 165 166 167 168 169 170
## -0.745804 -1.033322 -1.038690 -1.037851 -1.044465 -1.046229 -1.031451
## 171 172 173 174 175 176 177
## -1.025186 -1.019704 -1.046420 -1.039453 -1.093283 -1.059456 -1.058732
## 178 179 180 181 182 183 184
## -0.684756 -0.425527 -0.392179 -0.395277 -0.409858 -0.962708 -0.998106
## 185 186 187 188 189 190 191
## -1.033858 -0.981408 -0.067529 0.038991 0.096452 0.138378 0.101324
## 192 193 194 195 196 197 198
## 0.091463 0.072296 0.087245 0.084025 0.081671 0.071502 0.054412
## 199
## 0.041498
##
## Lower & upper 95% CIs:
## lower upper
## 101 -0.353689 -0.033841
## 102 0.049784 0.322288
## 103 0.309548 0.522996
## 104 0.351400 0.533644
## 105 0.406334 0.545565
## 106 0.414670 0.506775
## 107 0.419322 0.499579
## 108 0.406877 0.475034
## 109 0.112077 0.329662
## 110 0.122333 0.323186
## 111 0.135702 0.354027
## 112 0.216076 0.420129
## 113 0.282627 0.408017
## 114 0.054781 0.238466
## 115 0.122585 0.320036
## 116 0.079042 0.305628
## 117 0.145089 0.246480
## 118 0.135600 0.317919
## 119 0.105118 0.307528
## 120 0.088770 0.260508
## 121 0.078376 0.223635
## 122 0.069538 0.194867
## 123 0.144605 0.324641
## 124 0.145006 0.326176
## 125 0.137912 0.332381
## 126 0.119169 0.317307
## 127 0.238390 0.314272
## 128 -0.359918 -0.043992
## 129 -0.662893 -0.375611
## 130 -0.727801 -0.475058
## 131 -0.766331 -0.588134
## 132 -0.769390 -0.590422
## 133 -0.717511 -0.579119
## 134 -0.697244 -0.583671
## 135 -0.686688 -0.609627
## 136 -0.750929 -0.585330
## 137 -0.729074 -0.618351
## 138 -0.750354 -0.584111
## 139 -0.715003 -0.626515
## 140 -0.692022 -0.646216
## 141 -0.688743 -0.648658
## 142 -0.746592 -0.503293
## 143 -0.834407 -0.610116
## 144 -0.874649 -0.684496
## 145 -0.868861 -0.683371
## 146 -0.879846 -0.683350
## 147 -0.874696 -0.712595
## 148 -0.861123 -0.678942
## 149 -0.738451 -0.654653
## 150 -0.879227 -0.718366
## 151 -0.879632 -0.718181
## 152 -0.880657 -0.699956
## 153 -0.863390 -0.663774
## 154 -0.883112 -0.733604
## 155 -0.847027 -0.725316
## 156 -0.900784 -0.796068
## 157 -0.869452 -0.824141
## 158 -0.812936 -0.597001
## 159 -0.666963 -0.615045
## 160 -0.773673 -0.577865
## 161 -0.983550 -0.750918
## 162 -1.038762 -0.897272
## 163 -1.013439 -0.928050
## 164 -0.865613 -0.625995
## 165 -1.100397 -0.966246
## 166 -1.100045 -0.977336
## 167 -1.094351 -0.981351
## 168 -1.101313 -0.987617
## 169 -1.101252 -0.991205
## 170 -1.084252 -0.978651
## 171 -1.068248 -0.982123
## 172 -1.033112 -1.006296
## 173 -1.101520 -0.991320
## 174 -1.102232 -0.976674
## 175 -1.156728 -1.029838
## 176 -1.121073 -0.997840
## 177 -1.122105 -0.995359
## 178 -0.750646 -0.618867
## 179 -0.543691 -0.307363
## 180 -0.481934 -0.302425
## 181 -0.463694 -0.326860
## 182 -0.423952 -0.395763
## 183 -1.084062 -0.841354
## 184 -1.076519 -0.919694
## 185 -1.080485 -0.987230
## 186 -1.030321 -0.932496
## 187 -0.233243 0.098185
## 188 -0.120731 0.198713
## 189 -0.003465 0.196369
## 190 0.033319 0.243437
## 191 0.042163 0.160486
## 192 -0.001864 0.184790
## 193 0.018033 0.126560
## 194 0.000148 0.174341
## 195 -0.000961 0.169010
## 196 0.001453 0.161889
## 197 -0.007817 0.150821
## 198 -0.015527 0.124351
## 199 -0.015398 0.098394
plot(a,fitBM$ace,pch=21,bg="grey",cex=1.5,xlab="true values",
ylab="fitted values (BM)")
lines(range(x),range(x),lwd=2,lty="dotted")
abline(h=0,lwd=1,lty="dotted",col="grey")
abline(v=0,lwd=1,lty="dotted",col="grey")
There doesn't seem to be too much of a difference, but if we compare the sum-of-squares difference between the true & estimated values, in general we should find it to be smaller with the EB model:
sum((fitEB$ace-a)^2)
## [1] 0.343143
sum((fitBM$ace-a)^2)
## [1] 0.559287
Let's repeat this a bunch of times to see that it is most often the case:
nrep<-100
ntax<-100
SSdiff<-rep(NA,nrep)
for(i in 1:nrep){
## sim tree
t<-pbtree(n=ntax,scale=1)
## sim data
y<-fastBM(phytools:::ebTree(t,-4),internal=T)
anc<-y[1:t$Nnode+Ntip(t)]
y<-y[1:Ntip(t)]
fitEB<-anc.ML(t,y,model="EB")
fitBM<-fastAnc(t,y,CI=T)
SSdiff[i]<-sum((fitBM$ace-anc)^2)-sum((fitEB$ace-anc)^2)
}
hist(SSdiff,col="grey",main=
"Accuracy of BM - EB (when data simulated under EB)",
xlab="SS(BM) - SS(EB)",breaks=seq(min(SSdiff),max(SSdiff),
by=diff(range(SSdiff)/12)))
This shows that EB is more accurate than BM, when the data have been simulated under an EB model.
Tree & data for the leading example were simulated as follows:
tree<-pbtree(n=100,scale=1)
x<-fastBM(phytools:::ebTree(tree,-4),internal=T)
a<-x[1:tree$Nnode+Ntip(tree)]
x<-x[1:Ntip(tree)]
Would you be able to point me towards any code or package to simulate decreasing evolutionary rate through time for a continuous character?
ReplyDeleteAlso, thank for the tutorial! It's very helpful!