Wednesday, December 27, 2023

Fitting a Pagel '94 type model to test the hypothesis that the rate of evolution in one discrete character (with any number of levels) depends on the state of another

Recently, a colleague contacted me about using the phytools function fitmultiMk (described in an ‘in review’ pre-print here) to test a hypothesis that the state of one discrete character influenced the rate of evolution in a second, where the character history of the former is uncertain, rather than known.

Much as I’d like him to use our new method, I had to inform him that I thought a more appropriate analysis for his question would be to set up a “custom” Pagel ’94 - type model. What do I mean by that? Well, the Pagel (1994) approach is most often thought to be a test of correlated binary trait evolution; however, what it actually is is a test about whether the state of trait 1 influences the rate of trait 2 and/or vice versa. Depending on the specific manner in which the traits influence each others evolution, this might lead to an accumulation of certain combinations of character levels over time – in other words, an evolutionary correlation between the two characters.

On the other hand, if all we’re interested in doing (as my colleague indicated he was) is testing whether the condition of trait 1 influenced the back-and-forth rate of character 2, this too can be tested in much the same way as Pagel (1994) describes – just with a slightly different design matrix of our model.

For his benefit, and for anyone else who might be interested, I’m going to review how to do this here. Note that if both our traits are binary in nature, we can also using phytools::fitPagel, which will do us the favor of automating all the following steps.

To start, and to make sure we’re doing what I’m going to claim we’re doing, let’s begin by simulating one discrete character (x), along with a second (y) whose rate of evolution depends on the state of x. To do all this, I’m going to use phytools functions pbtree (to obtain the tree), sim.history (to simulate a discrete character history for x), and sim.multiMk (to simulate our second character, y – the one whose rate is meant to depend on the level of x). I’m going to call the levels of my trait x "slow" and "fast" – though, of course, in practice we needn’t know which levels of x correspond to sedentary or speedy rates of evolution for y!

## load packages
library(phytools)
## simulate 400-taxon tree
tree<-pbtree(n=400,scale=1)
tree
## 
## Phylogenetic tree with 400 tips and 399 internal nodes.
## 
## Tip labels:
##   t4, t5, t21, t315, t316, t67, ...
## 
## Rooted; includes branch lengths.
## simulate history of x
Qx<-matrix(c(-0.5,0.5,0.5,-0.5),2,2,
  dimnames=list(c("slow","fast"),c("slow","fast")))
Qx
##      slow fast
## slow -0.5  0.5
## fast  0.5 -0.5
sim_tree<-sim.history(tree,Qx,anc="slow")
## Done simulation(s).
plot(sim_tree,setNames(palette()[c(2,4)],c("fast","slow")),
  ftype="off",lwd=1)

plot of chunk unnamed-chunk-7

(Remember, this is just the generating history of our rate regimes for y!)

## get tip states of x
x<-getStates(sim_tree,"tips")
head(x)
##     t4     t5    t21   t315   t316    t67 
## "slow" "fast" "slow" "slow" "slow" "slow"
## now simulate second character, y, 
## whose rate depends on x
Qy<-matrix(c(-1,1,0,1,-2,1,0,1,-2),3,3,
  dimnames=list(letters[1:3],letters[1:3]))
QQ<-setNames(list(Qy,10*Qy),c("slow","fast"))
## this is a list of Q matrices for simulation
QQ
## $slow
##    a  b  c
## a -1  1  0
## b  1 -2  1
## c  0  1 -2
## 
## $fast
##     a   b   c
## a -10  10   0
## b  10 -20  10
## c   0  10 -20

We should be able to see here that I’m about to simulate an ordered, three-state character with levels "a", "b", and "c" – that will evolve with a rate of evolution that’s 10 \(\times\) higher on the red edges of our tree, from above, than it does on the blue edges.

y<-sim.multiMk(sim_tree,QQ)
head(y,20)
##   t4   t5  t21 t315 t316  t67 t130 t131 t106  t85 t391 t392  t93 t364 t365  t33   t3 t263 t379 t380 
##    c    b    b    a    a    b    b    a    a    a    a    a    a    b    b    b    b    c    c    c 
## Levels: a b c

Now, of course if we knew the true history of x on the tree we could go ahead and use fitmultiMk (the very thing I started off by saying was not appropriate).

Just for fun, and only because we know the true history in this simulation, let’s do precisely that.

MODEL<-Qy
diag(MODEL)<-0
fitmultiMk(sim_tree,y,model=MODEL)
## Object of class "fitmultiMk".
## 
## Fitted value of Q[fast]:
##           a          b         c
## a -9.326254   9.326254  0.000000
## b  9.326254 -18.652509  9.326254
## c  0.000000   9.326254 -9.326254
## 
## Fitted value of Q[slow]:
##           a         b         c
## a -1.042671  1.042671  0.000000
## b  1.042671 -2.085342  1.042671
## c  0.000000  1.042671 -1.042671
## 
## Fitted (or set) value of pi:
##         a         b         c 
## 0.3333333 0.3333333 0.3333333 
## 
## Log-likelihood: -280.170925 
## 
## Optimization method used was "nlminb"

Most likely, we can see here that fitmultiMk gives very satisfying estimates of our generating values of \(\mathbf{Q}_{fast}\) and \(\mathbf{Q}_{slow}\)!

On the other hand, if we don’t know the real history of x, as would be the case for most realistic datasets, a better approach (IMO) is to use the Pagel ’94 style approach.

To fit this kind of model we first need to create a new composite character of x and y as follows. Though it may not be necessary here, in general it’s a good idea to make sure that our trait factor includes all possible levels (i.e., combinations) of our two traits – not just the combinations that are observed in our data. The easiest way to do that in R is using the function interaction as follows.

xy<-setNames(interaction(y,x),names(x))
head(xy)
##     t4     t5    t21   t315   t316    t67 
## c.slow b.fast b.slow a.slow a.slow b.slow 
## Levels: a.fast b.fast c.fast a.slow b.slow c.slow

Only because I want to, let’s update the levels to a+fast, b+fast, etc. (This just looks nicer.)

levels(xy)<-gsub(".","+",levels(xy),fixed=TRUE)
head(xy)
##     t4     t5    t21   t315   t316    t67 
## c+slow b+fast b+slow a+slow a+slow b+slow 
## Levels: a+fast b+fast c+fast a+slow b+slow c+slow

Finally, we’re ready to build our design matrix for our Pagel-type model.

This is going to look like a normal Mk model design matrix, but in which we only permit changes (say) between "a+fast" \(\leftrightarrow\) "b+fast", etc., and "a+slow" \(\leftrightarrow\) "b+slow", etc., and "a+fast" \(\leftrightarrow\) "a+slow", etc.

According to our hypothesis, each of these types of change, of course, will occur at different rates.

MODEL<-matrix(0,6,6,dimnames=list(levels(xy),levels(xy)))
MODEL
##        a+fast b+fast c+fast a+slow b+slow c+slow
## a+fast      0      0      0      0      0      0
## b+fast      0      0      0      0      0      0
## c+fast      0      0      0      0      0      0
## a+slow      0      0      0      0      0      0
## b+slow      0      0      0      0      0      0
## c+slow      0      0      0      0      0      0
MODEL[1,2]<-MODEL[2,1]<-MODEL[2,3]<-MODEL[3,2]<-1
MODEL
##        a+fast b+fast c+fast a+slow b+slow c+slow
## a+fast      0      1      0      0      0      0
## b+fast      1      0      1      0      0      0
## c+fast      0      1      0      0      0      0
## a+slow      0      0      0      0      0      0
## b+slow      0      0      0      0      0      0
## c+slow      0      0      0      0      0      0
MODEL[4,5]<-MODEL[5,4]<-MODEL[5,6]<-MODEL[6,5]<-2
MODEL
##        a+fast b+fast c+fast a+slow b+slow c+slow
## a+fast      0      1      0      0      0      0
## b+fast      1      0      1      0      0      0
## c+fast      0      1      0      0      0      0
## a+slow      0      0      0      0      2      0
## b+slow      0      0      0      2      0      2
## c+slow      0      0      0      0      2      0
MODEL[1,4]<-MODEL[2,5]<-MODEL[3,6]<-
  MODEL[4,1]<-MODEL[5,2]<-MODEL[6,3]<-3
MODEL
##        a+fast b+fast c+fast a+slow b+slow c+slow
## a+fast      0      1      0      3      0      0
## b+fast      1      0      1      0      3      0
## c+fast      0      1      0      0      0      3
## a+slow      3      0      0      0      2      0
## b+slow      0      3      0      2      0      2
## c+slow      0      0      3      0      2      0

Perfect! I hope that it’s straightforward to see how we might adapt this design matrix for different hypotheses of evolution of our traits, or to more character levels of x or y.

Now let’s go ahead and fit our model!

fit_xy<-fitMk(tree,xy,model=MODEL)
fit_xy
## Object of class "fitMk".
## 
## Fitted (or set) value of Q:
##           a+fast     b+fast    c+fast    a+slow    b+slow    c+slow
## a+fast -9.979769   9.473190  0.000000  0.506579  0.000000  0.000000
## b+fast  9.473190 -19.452959  9.473190  0.000000  0.506579  0.000000
## c+fast  0.000000   9.473190 -9.979769  0.000000  0.000000  0.506579
## a+slow  0.506579   0.000000  0.000000 -1.511920  1.005341  0.000000
## b+slow  0.000000   0.506579  0.000000  1.005341 -2.517261  1.005341
## c+slow  0.000000   0.000000  0.506579  0.000000  1.005341 -1.511920
## 
## Fitted (or set) value of pi:
##   a+fast   b+fast   c+fast   a+slow   b+slow   c+slow 
## 0.166667 0.166667 0.166667 0.166667 0.166667 0.166667 
## due to treating the root prior as (a) flat.
## 
## Log-likelihood: -393.865149 
## 
## Optimization method used was "nlminb"
## 
## R thinks it has found the ML solution.

Just for fun, here’s a cool visualization of our data, tree, & model.

mat<-matrix(c(1,2),1,2)
layout(mat,widths=c(0.4,0.6))
plotTree(tree,ftype="off",lwd=1,xlim=c(0,1.1))
col1<-palette()[c(2,4)]
col2<-viridisLite::viridis(n=3)
xx<-as.factor(x[tree$tip.label])
yy<-as.factor(y[tree$tip.label])
for(i in 1:Ntip(tree)){
  polygon(x=c(1,1.05,1.05,1),y=i+c(-0.5,-0.5,0.5,0.5),
    col=col1[as.numeric(xx)[i]],border="transparent")
  polygon(x=c(1.05,1.1,1.1,1.05),y=i+c(-0.5,-0.5,0.5,0.5),
    col=col2[as.numeric(yy)[i]],border="transparent")
}
plot(fit_xy,width=TRUE,color=TRUE,show.zeros=FALSE,
  mar=rep(0,4),spacer=0.15)

plot of chunk unnamed-chunk-19

Really? Is that all?

How about comparison to a null model in which the rate of y does not depend on x?

Well, that’s easy! Just as for the standard Pagel ’94 model, the likelihood of our null (i.e., independent) model is just the sum of the log-likelihoods of the corresponding Mk models for our two traits.

lik.x<-logLik(fitMk(tree,x,model="ER"))
lik.y<-logLik(fitMk(tree,y,
  model=matrix(c(0,1,0,1,0,1,0,1,0),3,3)))
logL.null<-lik.x+lik.y
attr(logL.null,"df")<-attr(lik.x,"df")+attr(lik.y,"df")
logL.null
## [1] -420.103
## attr(,"df")
## [1] 2
## custom function for likelihood-ratio test
lr.test<-function(lik1,lik2){
  LR<-2*(lik2-lik1)
  as.numeric(
    pchisq(LR,df=attr(lik2,"df")-attr(lik1,"df"),
      lower.tail=FALSE))
}
P<-lr.test(logL.null,logLik(fit_xy))
P
## [1] 4.356051e-13

That’s it!