A *phytools* user recently contacted me to ask if it was possible to fit the Pagel (1994) correlated binary trait evolution model to data with intraspecific polymorphism (treating the polymorphic condition as intermediate between the two corresponding monomorphic states).

Indeed, this *is* possible and this is what that model looks like when written down on a sheet of paper for characters `A/B`

and `0/1`

:

To see how we might go about fitting this model to data, we can start by loading *phytools*.

```
library(phytools)
```

```
## Loading required package: ape
```

```
## Loading required package: maps
```

Now let’s imagine a tree (our model has a lot of parameters, so I’m imagining myself a *large* tree), and a pair of two data vectors of characters that evolved on this tree.

```
tree
```

```
##
## Phylogenetic tree with 800 tips and 799 internal nodes.
##
## Tip labels:
## t194, t452, t453, t210, t318, t319, ...
##
## Rooted; includes branch lengths.
```

For each character vector, we have a binary trait with polymorphism. Character `x`

has levels `"A"`

, `"B"`

, and the polymorphic condition `"A+B"`

, while character `y`

has levels `"0"`

, `"1"`

, and the polymorphic condition `"0+1"`

.

```
head(x,20)
```

```
## t194 t452 t453 t210 t318 t319 t128 t296 t297 t21 t108 t443 t519 t577 t578 t769 t770 t170 t744 t745
## A A A A A A A A A A A A A A A A A A A A
## Levels: A A+B B
```

```
head(y,20)
```

```
## t194 t452 t453 t210 t318 t319 t128 t296 t297 t21 t108 t443 t519 t577 t578 t769 t770 t170 t744 t745
## 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
## Levels: 0 0+1 1
```

Before continuing, let’s create a simple graph of the two traits at the tips of the tree. For each character, I use one color level or the other if the trait is monomorphic for a species, or a split of the two colors if the character is polymorphic. (To learn how to create custom plots like this, I recommend checking out Chapter 13 of my recent book with Luke Harmon.)

```
h<-max(nodeHeights(tree))
plotTree(tree,lwd=1,color="grey",ftype="off",
ylim=h*c(0,1.2),direction="upwards")
pp<-get("last_plot.phylo",envir=.PlotPhyloEnv)
cols_x<-viridisLite::viridis(n=2)
for(i in 1:Ntip(tree)){
if(x[tree$tip.label[i]]=="A")
polygon(x=pp$xx[i]+c(-0.5,-0.5,0.5,0.5),
y=pp$yy[i]+c(0,0.05,0.05,0)*h+0.01*h,border=FALSE,col=cols_x[1])
else if(x[tree$tip.label[i]]=="B")
polygon(x=pp$xx[i]+c(-0.5,-0.5,0.5,0.5),
y=pp$yy[i]+c(0,0.05,0.05,0)*h+0.01*h,border=FALSE,col=cols_x[2])
else {
polygon(x=pp$xx[i]+c(-0.5,-0.5,0.5,0.5),
y=pp$yy[i]+c(0,0.025,0.025,0)*h+0.01*h,border=FALSE,col=cols_x[1])
polygon(x=pp$xx[i]+c(-0.5,-0.5,0.5,0.5)+0.01*h,
y=pp$yy[i]+c(0.025,0.05,0.05,0.025)*h+0.01*h,
border=FALSE,col=cols_x[2])
}
}
cols_y<-c("#ADD8E6",palette()[2])
for(i in 1:Ntip(tree)){
if(y[tree$tip.label[i]]=="0")
polygon(x=pp$xx[i]+c(-0.5,-0.5,0.5,0.5),
y=pp$yy[i]+0.05*h+c(0,0.05,0.05,0)*h+0.02*h,border=FALSE,col=cols_y[1])
else if(y[tree$tip.label[i]]=="1")
polygon(x=pp$xx[i]+c(-0.5,-0.5,0.5,0.5),
y=pp$yy[i]+0.05*h+c(0,0.05,0.05,0)*h+0.02*h,border=FALSE,col=cols_y[2])
else {
polygon(x=pp$xx[i]+c(-0.5,-0.5,0.5,0.5),
y=pp$yy[i]+0.05*h+c(0,0.025,0.025,0)*h+0.02*h,border=FALSE,col=cols_y[1])
polygon(x=pp$xx[i]+c(-0.5,-0.5,0.5,0.5),
y=pp$yy[i]+0.05*h+c(0.025,0.05,0.05,0.025)*h+0.02*h,
border=FALSE,col=cols_y[2])
}
}
par(lend=2)
legend("topleft",c("A","B","0","1"),col=c(cols_x,cols_y),
lwd=4,horiz=TRUE,bty="n")
text(x=0,y=1.035*h,"x",pos=2,font=3)
text(x=0,y=1.095*h,"y",pos=2,font=3)
```

With our tree & data in hand, I’ll begin by fitting the *independent* evolution model. This is easy because I can just use the *phytools* function `fitpolyMk`

to fit my model for each trait and then add the log-likelihoods to compute the total likelihood under independent evolution of each trait.

```
ind_x<-fitpolyMk(tree,x,model="ARD")
```

```
##
## This is the design matrix of the fitted model.
## Does it make sense?
##
## A B A+B
## A 0 0 1
## B 0 0 3
## A+B 2 4 0
```

```
ind_x
```

```
## Object of class "fitpolyMk".
##
## Evolution was modeled as 'unordered' using the "ARD" model.
##
## Fitted (or set) value of Q:
## A B A+B
## A -0.036327 0.000000 0.036327
## B 0.000000 -0.061044 0.061044
## A+B 0.308363 0.143621 -0.451984
##
## Fitted (or set) value of pi:
## A B A+B
## 0.333333 0.333333 0.333333
##
## Log-likelihood: -281.998966
##
## Optimization method used was "nlminb"
##
## R thinks it has found the ML solution.
```

```
ind_y<-fitpolyMk(tree,y,model="ARD")
```

```
##
## This is the design matrix of the fitted model.
## Does it make sense?
##
## 0 1 0+1
## 0 0 0 1
## 1 0 0 3
## 0+1 2 4 0
```

```
ind_y
```

```
## Object of class "fitpolyMk".
##
## Evolution was modeled as 'unordered' using the "ARD" model.
##
## Fitted (or set) value of Q:
## 0 1 0+1
## 0 -0.078555 0.000000 0.078555
## 1 0.000000 -0.060303 0.060303
## 0+1 0.169576 0.243399 -0.412976
##
## Fitted (or set) value of pi:
## 0 1 0+1
## 0.333333 0.333333 0.333333
##
## Log-likelihood: -257.539565
##
## Optimization method used was "nlminb"
##
## R thinks it has found the ML solution.
```

The total likelihood under the *independent* evolution model is just…

```
lik_independent<-logLik(ind_x)+logLik(ind_y)
lik_independent
```

```
## [1] -539.5385
## attr(,"df")
## [1] 4
```

…with a total of *eight* estimated parameters.

Here’s where things get complicated! If our trait did not exhibit intraspecific polymorphism, we could proceed to fit our dependent (“correlated”) character evolution (that is, the Pagel ‘94 model) using `phytools::fitPagel`

. Unfortunately, if we want to take into account species polymorphism (and there’s a substantial amount of it in our data) that option is not available to us. We have to fit the model manually!

Our first step will be to create a new trait vector which represents the composite of trait vectors `x`

and `y`

. That is to say, for example, if a species has condition `A+B`

for trait `x`

and `0`

for trait `y`

, it will have condition `A+B|0`

in our composite vector.

This is how we do that.

```
xy<-setNames(interaction(x,y[names(x)],sep="|"),names(x))
head(xy,20)
```

```
## t194 t452 t453 t210 t318 t319 t128 t296 t297 t21 t108 t443 t519 t577 t578 t769 t770 t170 t744 t745
## A|1 A|1 A|1 A|1 A|1 A|1 A|1 A|1 A|1 A|1 A|1 A|1 A|1 A|1 A|1 A|1 A|1 A|1 A|1 A|1
## Levels: A|0 A+B|0 B|0 A|0+1 A+B|0+1 B|0+1 A|1 A+B|1 B|1
```

Next, I’m going to create a *design* matrix for our fitted model. To see how to create design matrices for M*k* models in R, I recommend Chapter 6 of my book – or searching this blog.

In this instance, things get a bit complicated, because rather than fill it in element-by-element, I wanted to iterate over my matrix and figure out which transitions should be allowed and which ones not. For example, we can change from `A+B|1`

to `A|1`

, `B|0`

and `A+B|0+1`

, but not to `A+B|1`

– because the latter requires that we first go through a polymorphic condition for character `y`

. (See hand-drawn plot above!)

```
lsetdiff<-function(x,y){
max(sapply(list(setdiff(x,y),setdiff(y,x)),length))
}
lintersect<-function(x,y) length(intersect(x,y))
levs<-levels(xy)
k<-length(levs)
MODEL<-matrix(0,k,k,dimnames=list(levs,levs))
ind<-1
for(i in 1:k){
s1<-strsplit(
strsplit(rownames(MODEL)[i],split="|",fixed=TRUE)[[1]],
split="+",fixed=TRUE)
for(j in 1:k){
s2<-strsplit(
strsplit(rownames(MODEL)[j],split="|",fixed=TRUE)[[1]],
split="+",fixed=TRUE)
if(sum(mapply(lsetdiff,s1,s2))==1 && all(mapply(lintersect,s1,s2)>0)){
MODEL[i,j]<-ind
ind<-ind+1
}
}
}
```

Here’s the final form of our design matrix.

```
MODEL
```

```
## A|0 A+B|0 B|0 A|0+1 A+B|0+1 B|0+1 A|1 A+B|1 B|1
## A|0 0 1 0 2 0 0 0 0 0
## A+B|0 3 0 4 0 5 0 0 0 0
## B|0 0 6 0 0 0 7 0 0 0
## A|0+1 8 0 0 0 9 0 10 0 0
## A+B|0+1 0 11 0 12 0 13 0 14 0
## B|0+1 0 0 15 0 16 0 0 0 17
## A|1 0 0 0 18 0 0 0 19 0
## A+B|1 0 0 0 0 20 0 21 0 22
## B|1 0 0 0 0 0 23 0 24 0
```

Now, before we continue to fit our model using `phytools::fitMk`

and this design matrix, we would be wise to consider converting our input data to the form of a binary matrix. This is essential if there are any unobserved (but theoretically possible) levels of our combined trait.

```
XY<-to.matrix(xy,levels(xy))
head(XY)
```

```
## A|0 A+B|0 B|0 A|0+1 A+B|0+1 B|0+1 A|1 A+B|1 B|1
## t194 0 0 0 0 0 0 1 0 0
## t452 0 0 0 0 0 0 1 0 0
## t453 0 0 0 0 0 0 1 0 0
## t210 0 0 0 0 0 0 1 0 0
## t318 0 0 0 0 0 0 1 0 0
## t319 0 0 0 0 0 0 1 0 0
```

Finally, we can fit our model. This has a lot of parameters so I’ll using `opt.method="optimParallel"`

to speed up optimization.

```
poly_fit<-fitMk(tree,XY,model=MODEL,opt.method="optimParallel",
logscale=TRUE,rand_start=TRUE)
```

```
## Warning in log(comp[1:M + N]): NaNs produced
```

```
poly_fit
```

```
## Object of class "fitMk".
##
## Fitted (or set) value of Q:
## A|0 A+B|0 B|0 A|0+1 A+B|0+1 B|0+1 A|1 A+B|1 B|1
## A|0 -4.603255 2.713214 0.000000 1.890040 0.000000 0.000000 0.000000 0.000000 0.000000
## A+B|0 0.172895 -0.840878 0.467237 0.000000 0.200745 0.000000 0.000000 0.000000 0.000000
## B|0 0.000000 0.009010 -0.020183 0.000000 0.000000 0.011174 0.000000 0.000000 0.000000
## A|0+1 0.191515 0.000000 0.000000 -0.648068 0.061917 0.000000 0.394636 0.000000 0.000000
## A+B|0+1 0.000000 0.065584 0.000000 0.000014 -0.640738 0.433255 0.000000 0.141885 0.000000
## B|0+1 0.000000 0.000000 0.451230 0.000000 0.063926 -0.602957 0.000000 0.000000 0.087801
## A|1 0.000000 0.000000 0.000000 0.034641 0.000000 0.000000 -0.053804 0.019163 0.000000
## A+B|1 0.000000 0.000000 0.000000 0.000000 0.151977 0.000000 0.446862 -0.630548 0.031710
## B|1 0.000000 0.000000 0.000000 0.000000 0.000000 0.004413 0.000000 1.520881 -1.525295
##
## Fitted (or set) value of pi:
## A|0 A+B|0 B|0 A|0+1 A+B|0+1 B|0+1 A|1 A+B|1 B|1
## 0.111111 0.111111 0.111111 0.111111 0.111111 0.111111 0.111111 0.111111 0.111111
## due to treating the root prior as (a) given.
##
## Log-likelihood: -464.835922
##
## Optimization method used was "optimParallel"
##
## R thinks optimization may not have converged.
```

Let’s compare the independent & dependent models. Just as with a regular Pagel ’94 analysis, the dependent model has the independent model as a special case so should have a higher log-likelihood.

```
lik_ratio<-2*(logLik(poly_fit)-lik_independent)
P<-as.numeric(pchisq(lik_ratio,
df=attr(logLik(poly_fit),"df")-attr(logLik(ind_x),"df")-attr(logLik(ind_y),"df"),
lower.tail=FALSE))
P
```

```
## [1] 1.023475e-23
```

There’s no special plot method for this object – but, hopefully, if we graph the **Q** matrix we should see that the evolutionary process disproportionately points towards `A|1`

and `B|0`

, which is what was simulated via our generating process.

```
plot(poly_fit,width=TRUE,color=TRUE,show.zeros=FALSE,
xlim=c(-2,1),ylim=c(-1,1),spacer=0.15,text=FALSE)
```