## Wednesday, February 21, 2024

### Fitting an Mk model for ordered discrete character evolution on the tree where the rate of change between adjacent states vary according to an n-degree polynomial functional form

Over the past couple of days I’ve posted (1, 2) about fitting an ordered discrete character evolution (Mk) model in which the transition rates between adjacent states vary one from the other according to a functional form. Last night I added R code for a function (along with some class methods) to fit this model, for the condition in which our functional form is a quadratic polynomial that can be different for our forward & backward transition processes.

I’ve now upgraded & added this model to phytools in the form of the function fitfnMk (see code here).

Rather than merely being limited to simple quadratic functions, fitfnMk can fit nth degree polynomials in which n is arbitrary and can be different for forward and backward transition rates. $$n = 2$$ corresponds to the quadratic functions we saw before, while $$n = 1$$ is a straight line and $$n = 0$$ is a constant. (Likewise, $$n = 3$$ is a cubic polynomial, as we’ll see below.)

OK, let’s try it.

First, load phytools (which will need to be updated from it’s GitHub page in order to follow along).

library(phytools)
packageVersion("phytools")

## [1] '2.1.16'


Now, I’m going to simulate ordered discrete character data on a stochastic phylogeny according to two different 3rd degree polynomial functional forms: one, the forward rates, with high variation (i.e., high amplitude in my cubic polynomial function) in rates among adjacent states; and a second, the backward rates, with low variation. (If you’re curious about where these cubic functions came from, I used this site to generate them in standard form based on a set of four points I wanted on each curve!)

k<-21
x<-0:(k-2)+0.5
q1<-0.0105*x^3-0.31491*x^2+2.43862*x-0.64189
q2<--0.00233*x^3+0.06998*x^2-0.54192*x+3.25375
par(mar=c(5.1,4.1,2.1,1.1))
plot(x,q1,type="b",col="blue",bty="n",las=1,
axes=FALSE,xlab="",ylab="transition rate (q)",
ylim=c(0,max(c(q1,q2))))
lines(x,q2,type="b",col="red")
labs<-mapply(function(x,y) bquote(.(x) %<->% .(y)),
x=0:(k-2),y=1:(k-1))
axis(1,at=seq(0.5,k-1.5,by=1),labels=rep("",k-1))
nulo<-mapply(mtext,text=labs,at=seq(0.5,k-1.5,by=1),
MoreArgs=list(side=1,line=1,las=3,cex=0.7))
axis(2,las=1,cex.axis=0.8)
grid()
legend("bottomleft",c("forward","backward"),
col=c("blue","red"),
lty="solid",pch=1,cex=0.8,title="generating rates")


As in my prior posts, I’ll simulate on a large tree (1,000 tips in this case), just to give us the best chance of recovering our generating process. On the other hand, these models have lower parameter complexity (specifically, $$n_{forward}+n_{backward}+2$$ for our nth-degree polynomial in each trait direction) than do many other Mk model designs. For example, an ordered model with idiosyncratically different substitution rates between adjacent states would have $$2k-2 = 40$$ parameters to be estimated for $$k = 21$$ character levels, compared to 8 for a cubic polynomial functional form in each of the backward & forward directions.

Q<-matrix(0,k,k,dimnames=list(0:(k-1),0:(k-1)))
Q[cbind(1:(k-1),2:k)]<-q1
Q[cbind(2:k,1:(k-1))]<-q2
diag(Q)<--rowSums(Q)
tree<-pbtree(n=1000,scale=10)
y<-sim.Mk(tree,Q,anc=as.character(20))

## t707 t989 t990 t460 t461 t309 t310 t517 t977 t978 t865 t703 t704 t928 t929 t482 t378 t379 t100  t74 t404 t405
##    8    5    5    6    9   13   10   12   10   10    8   10   10   10   10   10   11   11   11    8    9    9
##  t46  t47  t39 t202 t547 t961 t962 t603
##    9   13   11   13   11    9    8    8
## Levels: 10 11 12 13 14 15 16 17 18 19 20 5 6 7 8 9


(I started my simulation with the ancestral condition of trait level 20 because some inspection of the Q matrix told me that this condition had the lowest expected relative frequency at equilibrium under our generating model.)

To start, I’m going to go ahead and fit the generating model – which uses a 3rd degree polynomial in each forward & backward direction. To do that I’ll use fitfnMk and set the argument degree = 3. Note that here I have supplied my character in binary matrix format (using phytools::to.matrix) to ensure that all possible levels (between 0 and $$k = 21$$) that the trait can assume are represented in my fitted model.

fit_deg3<-fitfnMk(tree,to.matrix(y,0:20),trace=0,
model="polynomial",degree=3,start="smart")

## Numerically optimizing simple equal-rates ordered model
##   to get better random starting values....


Cool. Let’s plot our fitted model & overlay the generating process as follows.

par(mar=c(5.1,4.1,2.1,1.1))
plot(fit_deg3)
x<-seq(0.5,(k-2)+0.5,length.out=100)
q1<-0.0105*x^3-0.31491*x^2+2.43862*x-0.64189
q2<--0.00233*x^3+0.06998*x^2-0.54192*x+3.25375
lines(x,q1,col="blue",lty="dotted")
lines(x,q2,col="red",lty="dotted")
legend("bottomright","generating",lty="dotted",cex=0.8)


That’s pretty neat.

For comparison, and also to illustrate how it’s done, let’s consider a 3rd degree (cubic) polynomial for our forward rates, and a 0th degree polynomial (i.e., a constant value) for our backward rates.

For some reason I don’t yet understand, this model tends to get stuck on local optima – so I’m going to try to avoid that by starting it out with good values for our model parameters based on our first, more parameter-rich fitted model, as follows.

start<-c(fit_deg3\$par[1:4],
mean(as.Qmatrix(fit_deg3)[cbind(2:k,1:(k-1))]))
start

## [1]  0.01536061 -0.47942863  4.09660798 -5.13130610  2.03754710

fit_deg3_0<-fitfnMk(tree,to.matrix(y,0:20),trace=1,
model="polynomial",degree=c(3,0),start=start)

##   0:     1955.1073: 0.0153606 -0.479429  4.09661 -5.13131  2.03755
##   1:     1932.0547: 0.0152060 -0.479441  4.09661 -5.13131  2.03755
##   2:     1931.9107: 0.0151942 -0.479441  4.09661 -5.13131  2.03755
##   3:     1931.6715: 0.0152072 -0.479827  4.09653 -5.13132  2.03758
##   4:     1931.1978: 0.0152245 -0.479811  4.09624 -5.13137  2.03784
##   5:     1917.8966: 0.0153249 -0.475968  4.02347 -5.14463  2.10657
##   6:     1897.9140: 0.0150427 -0.471537  4.02840 -5.15449  2.39150
##   7:     1897.6184: 0.0150662 -0.471535  4.02840 -5.15449  2.39150
##   8:     1897.5859: 0.0150613 -0.471533  4.02838 -5.15447  2.39150
##   9:     1897.5834: 0.0150595 -0.471531  4.02836 -5.15446  2.39150
##  10:     1895.8596: 0.0137907 -0.432399  3.71798 -4.91417  2.35956
##  11:     1892.9882: 0.0150594 -0.465541  3.88053 -4.10995  2.66633
##  12:     1892.4594: 0.0148992 -0.456926  3.73718 -3.31105  2.72103
##  13:     1891.3925: 0.0139966 -0.417587  3.18560 -0.732851  2.80694
##  14:     1890.6960: 0.0128802 -0.371870  2.58506  1.83552  2.83813
##  15:     1890.1392: 0.0115832 -0.320794  1.94321  4.39389  2.81774
##  16:     1889.7662: 0.0102876 -0.270886  1.33144  6.71900  2.75470
##  17:     1889.7067: 0.00999686 -0.259920  1.19955  7.18899  2.71919
##  18:     1889.7018: 0.0100780 -0.263227  1.24260  7.00224  2.71232
##  19:     1889.7012: 0.0101390 -0.265739  1.27585  6.85872  2.71089
##  20:     1889.7012: 0.0101460 -0.266052  1.28037  6.83751  2.71076
##  21:     1889.7012: 0.0101458 -0.266053  1.28052  6.83622  2.71069
##  22:     1889.7012: 0.0101456 -0.266046  1.28046  6.83639  2.71067


(trace = 1 told R to print out the progress of our numerical optimization as it ran.)

We can plot this model and overlay the generating conditions as well. The backward rates (in red) should form a straight line, while the forward rates (in blue) of our fitted model should take the form of a cubic polynomial.

par(mar=c(5.1,4.1,2.1,1.1))
plot(fit_deg3_0)
x<-seq(0.5,(k-2)+0.5,length.out=100)
q1<-0.0105*x^3-0.31491*x^2+2.43862*x-0.64189
q2<--0.00233*x^3+0.06998*x^2-0.54192*x+3.25375
lines(x,q1,col="blue",lty="dotted")
lines(x,q2,col="red",lty="dotted")
legend("bottomright","generating",lty="dotted",cex=0.8)


Awesome.

Lastly, comparing between these different models is easy.

The first model we fit (fit_deg3) has the other one as a special case – but consumes three additional parameters. As such we can compare them either using AIC or a likelihood-ratio test.

Let’s do the former using a generic anova call.

anova(fit_deg3_0,fit_deg3)

##               log(L) d.f.      AIC     weight
## fit_deg3_0 -1889.701    5 3789.402 0.91538721
## fit_deg3   -1889.082    8 3794.165 0.08461279


Most likely, this tells us that although our more complex model has a better (less negative) log-likelihood, the simpler model with a cubic polynomial functional form only for the forward (and not the backward) rates of evolution is a better explanation of our data, accounting for parameter complexity.

That’s all folks!