Monday, February 19, 2024

More on fitting an ordered Mk where the forward & backward rates vary according to a functional form

OK, earlier today I posted about fitting an ordered Mk model in which the rates of transition between adjacent states were allowed to vary according to some functional form. I tweeted excitedly about this, but so far my excitement does not appear to be shared….

It occurred to me that this might be because of the simple & symmetric nature of the functional forms that I used to illustrate the use cases of this category of model. In an effort to take this one step further, I thought it would be cool to show a totally different evolutionary scenario – one in which the forward rates vary parabolically, but the backward rates are constant between adjacent states.

k<-13
x<-0:(k-2)+0.5
q1<-0.1*x^2-1.2*x+3.6
q2<-rep(1.5,length(x))
par(mar=c(2.1,4.1,2.1,1.1))
plot(x,q1,type="b",col="blue",bty="n",las=1,
  axes=FALSE,xlab="",ylab="transition rate (q)",
  ylim=c(0,max(c(q1,q2))))
lines(x,q2,type="b",col="red")
axis(1,at=seq(0.5,k-1.5,by=1),paste(0:(k-2),"↔",1:(k-1)),
  cex.axis=0.7)
axis(2,las=1,cex.axis=0.8)
grid()
legend("bottomleft",c("forward","backward"),
  col=c("blue","red"),
  lty="solid",pch=1,cex=0.8)

[Plot: generating forward (blue, parabolic) & backward (red, constant) transition rates between adjacent states]

Now let’s simulate data under this model, and then fit it – just as we did before.

Keep in mind, I’m still going to fit a quadratic function to both my forward & backward rates – however, the quadratic has both a straight line & a constant as special cases, so we’ll see what comes out!
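
To make that nesting concrete, here’s a tiny sketch (the line’s slope & intercept below are arbitrary, just for illustration; the other two sets of coefficients are my generating values):

x<-0:(k-2)+0.5
0.1*x^2-1.2*x+3.6   # full quadratic: the generating forward rates
0.5*x+1             # a=0 leaves a straight line (arbitrary slope & intercept)
0*x+1.5             # a=b=0 leaves a constant: the generating backward rates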

library(phytools)
Q<-matrix(0,k,k,dimnames=list(0:(k-1),0:(k-1)))
Q[cbind(1:(k-1),2:k)]<-q1
Q[cbind(2:k,1:(k-1))]<-q2
diag(Q)<--rowSums(Q)
plot(as.Qmatrix(Q),show.zeros=FALSE,
  width=TRUE,color=TRUE,xlim=c(-2,1),ylim=c(-1,1),
  offset=0.04)

[Plot: the generating transition matrix, Q, graphed with plot.Qmatrix]

Remember, this is my generating model.

Now, let’s simulate our tree and data.

tree<-pbtree(n=1000)
y<-sim.Mk(tree,Q,anc=as.character(k-1))
head(y,20)
## t153 t380 t901 t902 t698 t802 t803 t959 t960 t546 t659 t660 t381 t551 t552 t288  t89 t114 t366 t367 
##    3    1    1    1    2    2    1    3    3    4    3    4    2    2    3    2    1    0    3    0 
## Levels: 0 1 10 11 12 2 3 4 5 6 7 8 9
Y<-to.matrix(y,0:(k-1))
head(Y,10)
##      0 1 2 3 4 5 6 7 8 9 10 11 12
## t153 0 0 0 1 0 0 0 0 0 0  0  0  0
## t380 0 1 0 0 0 0 0 0 0 0  0  0  0
## t901 0 1 0 0 0 0 0 0 0 0  0  0  0
## t902 0 1 0 0 0 0 0 0 0 0  0  0  0
## t698 0 0 1 0 0 0 0 0 0 0  0  0  0
## t802 0 0 1 0 0 0 0 0 0 0  0  0  0
## t803 0 1 0 0 0 0 0 0 0 0  0  0  0
## t959 0 0 0 1 0 0 0 0 0 0  0  0  0
## t960 0 0 0 1 0 0 0 0 0 0  0  0  0
## t546 0 0 0 0 1 0 0 0 0 0  0  0  0
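
As a quick sanity check on the simulated data, we can count how many tips ended up in each of our k = 13 states just by summing the columns of Y.

colSums(Y)   # number of tips in each state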

Here’s our likelihood function again.

lik<-function(par,pw,X){
  k<-ncol(X)
  # midpoints between adjacent states 0<->1, 1<->2, ..., (k-2)<->(k-1)
  x<-1:(k-1)-0.5
  # forward & backward rates as quadratic functions of x
  q1<-par[1]*x^2+par[2]*x+par[3]
  q2<-par[4]*x^2+par[5]*x+par[6]
  # negative rates are not allowed, so truncate them at zero
  q1[q1<0]<-0
  q2[q2<0]<-0
  # if either set of rates is entirely zero the model is degenerate
  if(all(q1==0)||all(q2==0)){ 
    return(Inf)
  } else {
    # index matrix: forward transitions get rates 1:(k-1),
    # backward transitions get rates k:(2*k-2)
    MODEL<-matrix(0,k,k,dimnames=list(colnames(X),colnames(X)))
    MODEL[cbind(1:(k-1),2:k)]<-1:(k-1)
    MODEL[cbind(2:k,1:(k-1))]<-k:(2*k-2)
    # return the negative log-likelihood from phytools' pruning algorithm
    return(-phytools:::pruning(c(q1,q2),pw,X,
      model=MODEL))
  }
}
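
Just to make the index matrix trick a bit less mysterious, here’s what MODEL looks like for a hypothetical four-state version of the problem (k_demo is just a throwaway name): each non-zero integer indexes the element of the rate vector c(q1,q2) that phytools:::pruning should put in that cell of Q.

k_demo<-4
M<-matrix(0,k_demo,k_demo,dimnames=list(0:(k_demo-1),0:(k_demo-1)))
M[cbind(1:(k_demo-1),2:k_demo)]<-1:(k_demo-1)        # forward: 0->1, 1->2, 2->3
M[cbind(2:k_demo,1:(k_demo-1))]<-k_demo:(2*k_demo-2) # backward: 1->0, 2->1, 3->2
M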

Next, let’s reorder the tree & generate random starting values for optimization, rejecting any draw that would imply negative starting rates.

pw<-reorder(tree,"postorder")
q1_start<-q2_start<--1
while(any(q1_start<0)||any(q2_start<0)){
  x<-0:(k-2)+0.5
  start<-runif(n=6)
  q1_start<-start[1]*x^2+start[2]*x+start[3]
  q2_start<-start[4]*x^2+start[5]*x+start[6]
}
start
## [1] 0.59691679 0.71808028 0.27507328 0.03471747 0.13922147 0.47962653
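
Before handing these values to the optimizer, it doesn’t hurt to check that our likelihood function returns a finite value at the starting point – this should match the objective printed at iteration 0 of the trace below.

lik(start,pw=pw,X=Y)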

Now let’s fit our model.

fit<-nlminb(start,lik,pw=pw,X=Y,control=list(trace=1))
##   0:     2321.6746: 0.596917 0.718080 0.275073 0.0347175 0.139221 0.479627
##   1:     1811.7292: 0.535401 0.712752 0.280821 0.301771 0.160516 0.475521
##   2:     1798.6987: 0.498564 0.708835 0.284986 0.319629 0.160821 0.471241
##   3:     1492.9236: -0.0406121 0.637112 0.387226 -0.0225034 0.0910384 0.353464
##   4:     1482.5660: -0.0323546 0.637803 0.387300 -0.0221383 0.0910951 0.353427
##   5:     1479.6515: -0.0346175 0.633989 0.385230 -0.0176511 0.0960649 0.353569
##   6:     1479.4932: -0.0358049 0.633734 0.385231 -0.0171424 0.0960235 0.353444
##   7:     1479.4056: -0.0350541 0.633062 0.385270 -0.0166829 0.0956473 0.352826
##   8:     1479.3086: -0.0353582 0.632178 0.385333 -0.0166521 0.0951048 0.352067
##   9:     1473.7316: -0.0258145 0.543856 0.393535 -0.00871541 0.0392535 0.271982
##  10:     1437.1064: -0.0208595 0.446100 0.394572 -0.00668892 0.113329 0.222163
##  11:     1433.3834: -0.0193526 0.446178 0.394578 -0.000905327 0.113953 0.222224
##  12:     1430.3337: -0.0164975 0.449382 0.395091 -0.00346836 0.111426 0.224343
##  13:     1429.1284: -0.0163786 0.444420 0.393894 -0.00221319 0.114179 0.223394
##  14:     1428.7256: -0.0140462 0.444357 0.393871 -0.00162670 0.114536 0.223429
##  15:     1428.6553: -0.0142506 0.444267 0.393859 -0.00108033 0.114642 0.223433
##  16:     1428.6137: -0.0140801 0.443810 0.393755 -0.00126249 0.114906 0.223342
##  17:     1418.5788: -0.000354273 0.263787 0.351006 -0.00902655 0.213624 0.182128
##  18:     1408.6210: 0.0195158 -0.00159143 0.607760 -0.0126058 0.223810 0.398582
##  19:     1393.5880: 0.0577282 -0.518446  1.10226 -0.0151095 0.221081 0.869435
##  20:     1388.5653: 0.0720243 -0.718260  1.25934 -0.0187585 0.217703  1.06182
##  21:     1388.3859: 0.0719940 -0.718262  1.25934 -0.0184737 0.217726  1.06183
##  22:     1388.3287: 0.0722791 -0.718236  1.25934 -0.0184512 0.217717  1.06182
##  23:     1388.3177: 0.0723395 -0.718255  1.25918 -0.0182438 0.217462  1.06226
##  24:     1388.2883: 0.0724296 -0.718295  1.25902 -0.0183103 0.217188  1.06273
##  25:     1388.2700: 0.0725812 -0.718392  1.25871 -0.0181349 0.216662  1.06367
##  26:     1388.2492: 0.0726849 -0.718640  1.25809 -0.0181107 0.215607  1.06560
##  27:     1387.7853: 0.0774310 -0.745136  1.20306 -0.0102137 0.120343  1.24502
##  28:     1387.7557: 0.0747168 -0.729611  1.39855 -0.00907155 0.114953  1.18198
##  29:     1387.6558: 0.0739815 -0.729720  1.46949 -0.00869594 0.112532  1.16699
##  30:     1387.5920: 0.0739517 -0.731920  1.45655 -0.00879786 0.111737  1.17312
##  31:     1387.5868: 0.0739554 -0.731922  1.45655 -0.00894254 0.111727  1.17312
##  32:     1387.5760: 0.0739097 -0.731970  1.45665 -0.00888137 0.111714  1.17316
##  33:     1387.5701: 0.0738172 -0.732075  1.45688 -0.00896333 0.111673  1.17324
##  34:     1387.5656: 0.0738071 -0.732274  1.45738 -0.00892292 0.111579  1.17341
##  35:     1386.9866: 0.0819337 -0.834607  1.72243 -0.00455282 0.0596784  1.26491
##  36:     1386.3992: 0.0892125 -0.933860  2.00759 -0.00378855 0.0517799  1.28917
##  37:     1385.8086: 0.0986031 -1.07303  2.50660 -0.00450757 0.0627017  1.29507
##  38:     1385.5896: 0.100599 -1.10739  2.62759 -0.00468324 0.0592590  1.34578
##  39:     1384.0396: 0.109492 -1.25537  3.10407 -0.00467200 0.0366256  1.55782
##  40:     1383.7009: 0.116547 -1.37170  3.47978 -0.00463472 0.0187998  1.72372
##  41:     1383.5976: 0.117518 -1.37327  3.48514 -0.00421234 0.0171793  1.72831
##  42:     1383.5934: 0.117495 -1.37326  3.48515 -0.00416888 0.0171754  1.72831
##  43:     1383.5896: 0.117454 -1.37318  3.48518 -0.00421692 0.0170869  1.72832
##  44:     1383.5854: 0.117438 -1.37317  3.48521 -0.00417143 0.0169657  1.72857
##  45:     1383.5758: 0.117403 -1.37295  3.48533 -0.00420806 0.0165983  1.72893
##  46:     1383.5647: 0.117388 -1.37294  3.48544 -0.00412275 0.0161193  1.72996
##  47:     1383.2069: 0.117594 -1.37274  3.49482 -0.00126609 -0.0208330  1.80948
##  48:     1382.3553: 0.118970 -1.38620  3.59536 0.00803553 -0.142786  2.07791
##  49:     1381.9572: 0.120546 -1.41298  3.72363 0.0101042 -0.170836  2.15138
##  50:     1380.7518: 0.126483 -1.52266  4.23325 0.0122174 -0.201510  2.25230
##  51:     1379.4498: 0.131354 -1.61728  4.66637 0.0106340 -0.182463  2.22203
##  52:     1378.2764: 0.137222 -1.73563  5.25056 0.00867056 -0.159658  2.18444
##  53:     1378.1866: 0.132736 -1.67283  5.05029 0.00646338 -0.125898  2.07562
##  54:     1378.1300: 0.128394 -1.60973  4.84927 0.00432253 -0.0941190  1.96775
##  55:     1378.0768: 0.122706 -1.53244  4.62419 0.00125011 -0.0484574  1.81100
##  56:     1378.0710: 0.123901 -1.54771  4.66952 0.00193594 -0.0576092  1.83464
##  57:     1378.0689: 0.124128 -1.54871  4.66971 0.00218511 -0.0594037  1.82953
##  58:     1378.0680: 0.124541 -1.55394  4.68280 0.00244398 -0.0626585  1.83729
##  59:     1378.0674: 0.124474 -1.55265  4.67638 0.00237246 -0.0613810  1.83216
##  60:     1378.0668: 0.124318 -1.55019  4.66612 0.00213583 -0.0577351  1.81990
##  61:     1378.0667: 0.124295 -1.55003  4.66574 0.00203014 -0.0562787  1.81586
##  62:     1378.0667: 0.124303 -1.55028  4.66706 0.00199484 -0.0558537  1.81501
##  63:     1378.0667: 0.124306 -1.55035  4.66735 0.00199655 -0.0558893  1.81517
##  64:     1378.0667: 0.124307 -1.55035  4.66738 0.00199709 -0.0558978  1.81520
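
Since nlminb minimizes the negative log-likelihood, the maximized log-likelihood is just the final objective with its sign flipped; for instance, we can compute an AIC by hand for our six fitted parameters as follows.

logL<--fit$objective   # maximized log-likelihood
logL
2*6-2*logL             # AIC with 6 estimated parameters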

This frequently ends up quite close to our generating functions. Let’s see.

x<-0:(k-2)+0.5
q1_est<-fit$par[1]*x^2+fit$par[2]*x+fit$par[3]
q2_est<-fit$par[4]*x^2+fit$par[5]*x+fit$par[6]
par(mar=c(2.1,4.1,2.1,1.1))
plot(x,q1,type="b",col="blue",bty="n",las=1,
  axes=FALSE,xlab="",ylab="transition rate (q)",
  ylim=c(0,max(c(q1,q2,q1_est,q2_est))),lty="dotted")
lines(x,q2,type="b",col="red",lty="dotted")
axis(1,at=seq(0.5,k-1.5,by=1),paste(0:(k-2),"↔",1:(k-1)),
  cex.axis=0.7)
axis(2,las=1,cex.axis=0.8)
grid()
lines(x,q1_est,lwd=2,col="blue",type="b")
lines(x,q2_est,lwd=2,col="red",type="b")
legend("bottomleft",
  c("forward","backward","simulated","estimated"),
  col=c("blue","red","black","black"),
  lty=c("solid","solid","dotted","solid"),
  lwd=c(1,1,1,2),
  cex=0.8,ncol=2)

[Plot: generating (dotted) vs. estimated (solid, heavier) forward & backward transition rates]

Finally, let’s trick plot.Qmatrix into graphing our fitted transition matrix.

MODEL<-matrix(0,k,k,
  dimnames=list(colnames(Y),colnames(Y)))
MODEL[cbind(1:(k-1),2:k)]<-1:(k-1)
MODEL[cbind(2:k,1:(k-1))]<-k:(2*k-2)
object<-list(index.matrix=MODEL,
  rates=c(q1_est,q2_est),states=colnames(Y))
class(object)<-"fitMk"
plot(object,show.zeros=FALSE,
  width=TRUE,color=TRUE,xlim=c(-2,1),ylim=c(-1,1),
  offset=0.04)

[Plot: the fitted transition matrix, graphed in the same style as the generating matrix]

Terrific. It worked!
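
If we’d like a numerical comparison too, we can line the fitted polynomial coefficients up against the generating ones – remembering that a constant backward rate of 1.5 corresponds to quadratic coefficients of 0, 0, & 1.5.

rbind(generating=c(0.1,-1.2,3.6,0,0,1.5),
  estimated=round(fit$par,3))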
