## Monday, February 19, 2024

### Fitting an ordered Mk model in which the rates of change between adjacent states vary according to a functional form

As readers of this blog know, lately I’ve been working a lot with discrete character evolution models – such as a model for discrete character evolution with $$\Gamma$$ distribution rate heterogeneity among edges, and things like the Pagel ’94 model with intraspecific polymorphism or more than two levels for each trait.

An interesting model that I haven’t yet posted about is one in which the rate of evolutionary change between adjacent levels of an ordered trait varies according to some function form. Note that this is going to be very closely related to the FPK model of Boucher et al. (2018).

I don’t yet have a function to fit this kind of model, but it’s not too difficult to code.

To illustrate this, I’m going to start by simulating data for a discrete character in which the backward and forward transition rates between adjacent character levels are described by two, independent quadratic (i.e., parabolic functions), as follows.

k<-11
x<-0:(k-2)+0.5
q1<-0.1*x^2-x+2.75
q2<--0.1*x^2+x+0.25
par(mar=c(2.1,4.1,2.1,1.1))
plot(x,q1,type="b",col="blue",bty="n",las=1,
axes=FALSE,xlab="",ylab="transition rate (q)",
ylim=c(0,max(c(q1,q2))))
lines(x,q2,type="b",col="red")
axis(1,at=0.5:9.5,paste(0:9,"↔",1:10),cex.axis=0.8)
axis(2,las=1,cex.axis=0.8)
grid()
legend("bottomleft",c("forward","backward"),
col=c("blue","red"),
lty="solid",pch=1,cex=0.8)


The symmetry of these two different functions is incidental – we are going to estimate a separate set of parameter values for each direction.

Let’s assemble these backward & forward rates into a Q matrix and then simulate.

library(phytools)

Q<-matrix(0,k,k,dimnames=list(0:(k-1),0:(k-1)))
Q[cbind(1:(k-1),2:k)]<-q1
Q[cbind(2:k,1:(k-1))]<-q2
diag(Q)<--rowSums(Q)
Q

##         0      1      2      3      4      5      6      7      8      9     10
## 0  -2.275  2.275  0.000  0.000  0.000  0.000  0.000  0.000  0.000  0.000  0.000
## 1   0.725 -2.200  1.475  0.000  0.000  0.000  0.000  0.000  0.000  0.000  0.000
## 2   0.000  1.525 -2.400  0.875  0.000  0.000  0.000  0.000  0.000  0.000  0.000
## 3   0.000  0.000  2.125 -2.600  0.475  0.000  0.000  0.000  0.000  0.000  0.000
## 4   0.000  0.000  0.000  2.525 -2.800  0.275  0.000  0.000  0.000  0.000  0.000
## 5   0.000  0.000  0.000  0.000  2.725 -3.000  0.275  0.000  0.000  0.000  0.000
## 6   0.000  0.000  0.000  0.000  0.000  2.725 -3.200  0.475  0.000  0.000  0.000
## 7   0.000  0.000  0.000  0.000  0.000  0.000  2.525 -3.400  0.875  0.000  0.000
## 8   0.000  0.000  0.000  0.000  0.000  0.000  0.000  2.125 -3.600  1.475  0.000
## 9   0.000  0.000  0.000  0.000  0.000  0.000  0.000  0.000  1.525 -3.800  2.275
## 10  0.000  0.000  0.000  0.000  0.000  0.000  0.000  0.000  0.000  0.725 -0.725


I’m going to set the ancestral value with the condition "10" because this is expected to be the lowest frequency state at equilibrium.

tree<-pbtree(n=1000)
y<-sim.Mk(tree,Q,anc="10")

## t887 t888 t357 t358 t680 t681 t244 t245 t102 t232 t233  t17  t86 t879 t880 t413 t460 t944 t945 t776
##   10   10    8   10    2    1    2    2    3    6    6   10    3    1    1    2    1    1    1    2
## Levels: 0 1 10 2 3 4 5 6 7 8 9


Our likelihood function is going to need our character vector in binary matrix format, so let’s compute that using phytools::to.matrix.

Y<-to.matrix(y,0:10)

##      0 1 2 3 4 5 6 7 8 9 10
## t887 0 0 0 0 0 0 0 0 0 0  1
## t888 0 0 0 0 0 0 0 0 0 0  1
## t357 0 0 0 0 0 0 0 0 1 0  0
## t358 0 0 0 0 0 0 0 0 0 0  1
## t680 0 0 1 0 0 0 0 0 0 0  0
## t681 0 1 0 0 0 0 0 0 0 0  0
## t244 0 0 1 0 0 0 0 0 0 0  0
## t245 0 0 1 0 0 0 0 0 0 0  0
## t102 0 0 0 1 0 0 0 0 0 0  0
## t232 0 0 0 0 0 0 1 0 0 0  0


Now, here is our likelihood function, in terms – not of Q itself – but of our quadratic function for the elements of Q.

lik<-function(par,pw,X){
k<-ncol(X)
x<-1:(k-1)-0.5
q1<-par[1]*x^2+par[2]*x+par[3]
q2<-par[4]*x^2+par[5]*x+par[6]
q1[q1<0]<-0
q2[q2<0]<-0
if(all(q1<0)||all(q2<0)){
return(Inf)
} else {
MODEL<-matrix(0,k,k,dimnames=list(colnames(X),colnames(X)))
MODEL[cbind(1:(k-1),2:k)]<-1:(k-1)
MODEL[cbind(2:k,1:(k-1))]<-k:(2*k-2)
return(-phytools:::pruning(c(q1,q2),pw,X,model=MODEL))
}
}


(This actually returns the negative log-likelihood, just to make our optimization a minimization rather than maximization problem. This is common.)

pw in the likelihood function is a reordered tree designed to be traversed in a post-order fashion, so let’s get that.

pw<-reorder(tree,"postorder")


Now let’s create some starting values for our function. This is kind of hard because many random values will result in transition rates of Q that are negative, which doesn’t make sense.

As such, I decided to generate a bunch of possible starting values at random until I find some that give me only positive transition rates. I’m not sure that this will be sufficient in general to ensure good starting values for this optimization!

q1_start<-q2_start<--1
while(any(q1_start<0)||any(q2_start<0)){
x<-0:(k-2)+0.5
start<-runif(n=6,)
q1_start<-start[1]*x^2-start[2]*x+start[3]
q2_start<-start[4]*x^2+start[5]*x+start[6]
}
start

## [1] 0.0469336 0.2968316 0.8565286 0.4041830 0.4375652 0.3864706


Now let’s perform our optimization. I’ll use nlminb, but I could’ve used other functions such as optim or optimParallel. By setting control=list(trace=1) we get to monitor the progress of our optimization.

fit<-nlminb(start,lik,pw=pw,X=Y,control=list(trace=1))

##   0:     3602.9981: 0.0469336 0.296832 0.856529 0.404183 0.437565 0.386471
##   1:     2135.8374: 0.978566 0.444600 0.886320 0.0796267 0.376820 0.368815
##   2:     2013.2767:  1.10431 0.315717 0.734891 0.512478 0.459909 0.388391
##   3:     1951.7286: 0.668884 0.147689 0.653551 0.662306 0.515636 0.386829
##   4:     1858.9620: 0.873345 -0.103412 0.471617 0.334222 0.508178 0.320884
##   5:     1717.2040: 0.897862 -1.14454  1.21389 0.395635 0.163189 -1.17634
##   6:     1656.3101: 0.718635 -1.22049  1.17559 0.375971 0.177587 -1.16936
##   7:     1631.8236: 0.757584 -1.29952  1.20848 0.206302 0.167771 -1.21703
##   8:     1573.4346: 0.581777 -1.14596  1.28475 0.186261 0.317080 -0.939541
##   9:     1553.8841: 0.543781 -1.15082  1.28910 0.103940 0.312598 -0.938755
##  10:     1550.5391: 0.475149 -1.15236  1.29867 0.117628 0.319260 -0.936670
##  11:     1537.0724: 0.479373 -1.10430  1.30998 0.0755077 0.323398 -0.908490
##  12:     1528.5904: 0.435137 -1.06228  1.31708 0.0760876 0.335428 -0.874991
##  13:     1516.8876: 0.393778 -0.959396  1.32368 0.0338406 0.352167 -0.799115
##  14:     1505.8326: 0.331454 -0.862403  1.31101 0.0390049 0.369442 -0.719216
##  15:     1496.7773: 0.292116 -0.752826  1.28640 0.00924389 0.366637 -0.647749
##  16:     1490.0800: 0.240565 -0.643499  1.24245 0.0124567 0.340524 -0.593643
##  17:     1486.3323: 0.207646 -0.535789  1.17750 -0.000596915 0.286791 -0.579520
##  18:     1480.3215: 0.189584 -0.432732  1.11605 0.00187114 0.356267 -0.604144
##  19:     1477.7272: 0.172441 -0.333322  1.05415 -0.0141887 0.426896 -0.634247
##  20:     1475.3754: 0.150448 -0.254063 0.970760 -0.00601896 0.366242 -0.686176
##  21:     1474.6698: 0.116291 -0.165807 0.965683 -0.0206969 0.404485 -0.588699
##  22:     1463.7921: 0.124703 -0.0849435 0.969944 -0.0110664 0.388431 -0.473882
##  23:     1463.3282: 0.111451 -0.0889066 0.967769 -0.00359618 0.392782 -0.471279
##  24:     1459.6949: 0.113985 -0.0838559 0.969032 -0.0136911 0.401017 -0.462659
##  25:     1457.7310: 0.107543 -0.0743512 0.971894 -0.0128264 0.423037 -0.440646
##  26:     1456.0828: 0.104744 -0.0488565 0.944355 -0.0226565 0.463253 -0.476850
##  27:     1455.4703: 0.0945570 -0.0460173 0.925933 -0.0228709 0.473353 -0.506842
##  28:     1454.4975: 0.0959298 -0.00430290 0.921772 -0.0251261 0.528937 -0.475900
##  29:     1453.3565: 0.0952626 -0.00478980 0.921437 -0.0302691 0.528792 -0.475679
##  30:     1452.6405: 0.0904115 -0.00619838 0.920684 -0.0299670 0.529676 -0.475080
##  31:     1452.2495: 0.0880577 -0.00224627 0.915796 -0.0341253 0.535456 -0.478754
##  32:     1451.8200: 0.0848770 0.00727603 0.905965 -0.0339079 0.548174 -0.487575
##  33:     1450.3696: 0.0663138 0.119659 0.806714 -0.0488639 0.672443 -0.556205
##  34:     1449.7638: 0.0586181 0.0995974 0.866692 -0.0437133 0.584813 -0.378900
##  35:     1449.5318: 0.0671982 0.0672152 0.885557 -0.0430494 0.582177 -0.322472
##  36:     1449.4067: 0.0600154 0.104100 0.848163 -0.0424568 0.562608 -0.285739
##  37:     1411.2879: 0.0591390 0.107148 0.846960 -0.0390699 0.524176 -0.230576
##  38:     1389.0543: 0.0493611 0.115861 0.844737 -0.0185639 0.380887 -0.00350748
##  39:     1388.5796: 0.0484746 0.113924 0.843809 -0.0317459 0.380952 -0.00245839
##  40:     1383.9654: 0.0455183 0.113012 0.843474 -0.0259091 0.382014 -0.00202301
##  41:     1379.8389: 0.0241335 0.101934 0.838847 -0.0324161 0.388583 0.00382079
##  42:     1377.7328: 0.0306157 0.0856953 0.845019 -0.0350664 0.403551 -0.00826944
##  43:     1347.5331: 0.0973856 -0.405447  1.07862 -0.0164689 0.248845 0.330675
##  44:     1338.6694: 0.114966 -0.548996  1.14335 -0.0285764 0.428899 0.149123
##  45:     1328.9011: 0.0906117 -0.454602  1.12047 -0.0551145 0.701577 0.0746376
##  46:     1317.1492: 0.0733184 -0.470795  1.25698 -0.0946185 0.963662 0.0401832
##  47:     1308.6278: 0.0627722 -0.500774  1.42653 -0.0913227 0.936886 0.0796463
##  48:     1306.8463: 0.0546360 -0.523722  1.59382 -0.100715 0.994967 0.0817817
##  49:     1300.5225: 0.0791695 -0.681724  1.93707 -0.0986108 0.992568 0.0966799
##  50:     1297.4974: 0.0940147 -0.848605  2.27016 -0.0981329 0.986093 0.164148
##  51:     1294.5325: 0.120473 -1.13988  2.97040 -0.0914386 0.913494 0.345255
##  52:     1294.1393: 0.128430 -1.23550  3.16688 -0.0903342 0.894294 0.435355
##  53:     1293.9003: 0.131025 -1.26998  3.27829 -0.0892101 0.876249 0.510064
##  54:     1293.8179: 0.131444 -1.28013  3.31673 -0.0875862 0.854185 0.569067
##  55:     1293.8083: 0.131170 -1.27939  3.31630 -0.0867533 0.843689 0.588803
##  56:     1293.8077: 0.130956 -1.27771  3.31172 -0.0865969 0.841774 0.590599
##  57:     1293.8076: 0.130827 -1.27668  3.30932 -0.0866216 0.841982 0.589668
##  58:     1293.8076: 0.130737 -1.27609  3.30836 -0.0866850 0.842587 0.588756
##  59:     1293.8076: 0.130695 -1.27594  3.30847 -0.0867374 0.843069 0.588388
##  60:     1293.8076: 0.130692 -1.27601  3.30883 -0.0867531 0.843202 0.588437
##  61:     1293.8076: 0.130695 -1.27605  3.30894 -0.0867530 0.843196 0.588496


Now let’s graph our results. Here, I’m going to create a plot that mimics our earlier one, but in which I overlay the fitted functional form on the generating one.

x<-0:(k-2)+0.5
q1_est<-fit$par[1]*x^2+fit$par[2]*x+fit$par[3] q2_est<-fit$par[4]*x^2+fit$par[5]*x+fit$par[6]
par(mar=c(2.1,4.1,2.1,1.1))
plot(x,q1,type="b",col="blue",bty="n",las=1,
axes=FALSE,xlab="",ylab="transition rate (q)",
ylim=c(0,max(c(q1,q2,q1_est,q2_est))),lty="dotted")
lines(x,q2,type="b",col="red",lty="dotted")
axis(1,at=0.5:9.5,paste(0:9,"↔",1:10),cex.axis=0.8)
axis(2,las=1,cex.axis=0.8)
grid()
legend("bottomleft",
c("forward","backward","simulated","estimated"),
col=c("blue","red","black","black"),
lty=c("solid","solid","dotted","solid"),
lwd=c(1,1,1,2),
cex=0.8,ncol=2)
lines(x,q1_est,lwd=2,col="blue",type="b")
lines(x,q2_est,lwd=2,col="red",type="b")


Unfortunately it almost never comes out this nicely! Still, pretty cool.

OK – so, just to make sure this wasn’t a by product of bad/good starting values, we can repeat this with our known parameter values, and see that the result is identical.

start<-c(0.1,-1,2.75,-0.1,1,0.25)
fit<-nlminb(start,lik,pw=pw,X=Y,control=list(trace=0))
x<-0:(k-2)+0.5
q1_est<-fit$par[1]*x^2+fit$par[2]*x+fit$par[3] q2_est<-fit$par[4]*x^2+fit$par[5]*x+fit$par[6]
par(mar=c(2.1,4.1,2.1,1.1))
plot(x,q1,type="b",col="blue",bty="n",las=1,
axes=FALSE,xlab="",ylab="transition rate (q)",
ylim=c(0,max(c(q1,q2,q1_est,q2_est))),lty="dotted")
lines(x,q2,type="b",col="red",lty="dotted")
axis(1,at=0.5:9.5,paste(0:9,"↔",1:10),cex.axis=0.8)
axis(2,las=1,cex.axis=0.8)
grid()
legend("bottomleft",
c("forward","backward","simulated","estimated"),
col=c("blue","red","black","black"),
lty=c("solid","solid","dotted","solid"),
lwd=c(1,1,1,2),
cex=0.8,ncol=2)
lines(x,q1_est,lwd=2,col="blue",type="b")
lines(x,q2_est,lwd=2,col="red",type="b")


That’s all folks!