I just updated the function `fitPagel`

with several new features
and models.

`fitPagel`

fits the Pagel
(1994) model in which
the rate of evolution for one discrete character can be affected by a second,
and *vice versa*.

The extensions in this version are as follows.

(1) I added the argument `dep.var`

which can have values
`"xy"`

, `"x"`

, or `"y"`

and permits
(respectively) the substitution rate of `x`

to depend on
the state of `y`

and *vice versa*, the rate of `x`

(only) to depend on the state of `y`

, and, finally, the rate of
`y`

(only) to dependent on the state of `x`

.

(2) I also added the argument `model`

which can have values
`"ER"`

, `"SYM"`

, or `"ARD"`

. This just sets
the substitution model for the individual characters, regardless of whether
or not the rates depend on a second state. So, for instance, if
`dep.var="x"`

and `model="ER"`

, then `x`

will have two rates, depending on the state of `y`

, but these
two rates will be equal in a forward & backward direction.

Note that `model="ER"`

may not correspond with what we often think
of as a correlated evolution model, because under a correlated evolution
model we might think that if trait 1 acquires state `"b"`

, for
instance, than trait 2 should also acquire state `"b"`

- but under
an equal rates model `"b|a" -> "b|b"`

may occur at a higher rate
than `"a|a" -> "a|b"`

, but that high rate is forced to be equal
to the rate of leaving the correlated states, `"b|b" -> "b|a"`

.

Here is a demo of how using these new options might look:

```
library(phytools)
packageVersion("phytools") ## from GitHub
```

```
## [1] '0.5.32'
```

```
## first, here is our data & tree
colors<-setNames(palette()[1:4],sort(unique(c(x,y))))
dotTree(tree,cbind(x,y),data.type="discrete",fsize=0.7,
colors=colors,pts=FALSE)
```

Now let's fit some models:

```
fit.xy<-fitPagel(tree,x,y)
fit.x<-fitPagel(tree,x,y,dep.var="x")
fit.y<-fitPagel(tree,x,y,dep.var="y")
fit.xy
```

```
##
## Pagel's binary character correlation test:
##
## Assumes "ARD" substitution model for both characters
##
## Independent model rate matrix:
## a|c a|d b|c b|d
## a|c -3.7275431 2.6817189 1.045824 0.000000
## a|d 1.2367301 -2.2825542 0.000000 1.045824
## b|c 0.5811491 0.0000000 -3.262868 2.681719
## b|d 0.0000000 0.5811491 1.236730 -1.817879
##
## Dependent (x & y) model rate matrix:
## a|c a|d b|c b|d
## a|c -5.5939241 5.5939241 0.000000 0.000000
## a|d 0.4590907 -1.8971976 0.000000 1.438107
## b|c 0.6371031 0.0000000 -2.886065 2.248962
## b|d 0.0000000 0.4892436 1.823519 -2.312762
##
## Model fit:
## log-likelihood AIC
## independent -53.76123 115.5225
## dependent -52.44912 120.8982
##
## Hypothesis test result:
## likelihood-ratio: 2.624216
## p-value: 0.6225394
##
## Model fitting method used was fitMk
```

```
fit.x
```

```
##
## Pagel's binary character correlation test:
##
## Assumes "ARD" substitution model for both characters
##
## Independent model rate matrix:
## a|c a|d b|c b|d
## a|c -3.7275431 2.6817189 1.045824 0.000000
## a|d 1.2367301 -2.2825542 0.000000 1.045824
## b|c 0.5811491 0.0000000 -3.262868 2.681719
## b|d 0.0000000 0.5811491 1.236730 -1.817879
##
## Dependent (x only) model rate matrix:
## a|c a|d b|c b|d
## a|c -5.8351846 2.6828060 3.152379 0.0000000
## a|d 1.2044747 -1.9168270 0.000000 0.7123523
## b|c 0.0975676 0.0000000 -2.780374 2.6828060
## b|d 0.0000000 0.8518788 1.204475 -2.0563535
##
## Model fit:
## log-likelihood AIC
## independent -53.76123 115.5225
## dependent -52.86809 117.7362
##
## Hypothesis test result:
## likelihood-ratio: 1.786292
## p-value: 0.4093659
##
## Model fitting method used was fitMk
```

```
fit.y
```

```
##
## Pagel's binary character correlation test:
##
## Assumes "ARD" substitution model for both characters
##
## Independent model rate matrix:
## a|c a|d b|c b|d
## a|c -3.7275431 2.6817189 1.045824 0.000000
## a|d 1.2367301 -2.2825542 0.000000 1.045824
## b|c 0.5811491 0.0000000 -3.262868 2.681719
## b|d 0.0000000 0.5811491 1.236730 -1.817879
##
## Dependent (y only) model rate matrix:
## a|c a|d b|c b|d
## a|c -5.7733459 4.5972311 1.176115 0.000000
## a|d 0.5382396 -1.7143544 0.000000 1.176115
## b|c 0.5655022 0.0000000 -3.120088 2.554586
## b|d 0.0000000 0.5655022 1.780828 -2.346330
##
## Model fit:
## log-likelihood AIC
## independent -53.76123 115.5225
## dependent -52.54412 117.0882
##
## Hypothesis test result:
## likelihood-ratio: 2.434218
## p-value: 0.2960849
##
## Model fitting method used was fitMk
```

There was no correlation simulated in these data - consequently it should be unsurprising (and perhaps reassuring!) that the best fit model is the simplest, independent model.

Since `fitPagel`

is just a wrapper function for other model-fitting
functions, we can change the optimizer easily. One good option is
`fitDiscrete`

in the geiger package:

```
fit.geiger<-fitPagel(tree,x,y,method="fitDiscrete")
```

```
## Warning in fitDiscrete(tree, xy, model = iQ, ...): Parameter estimates appear at bounds:
## q14
## q23
## q32
## q41
```

```
## Warning in fitDiscrete(tree, xy, model = dQ, ...): Parameter estimates appear at bounds:
## q14
## q23
## q32
## q41
```

```
fit.geiger
```

```
##
## Pagel's binary character correlation test:
##
## Assumes "ARD" substitution model for both characters
##
## Independent model rate matrix:
## a|c a|d b|c b|d
## a|c -3.2101072 2.6611828 0.5489243 0.0000000
## a|d 1.2335297 -1.7824540 0.0000000 0.5489243
## b|c 0.5872987 0.0000000 -3.2484816 2.6611828
## b|d 0.0000000 0.5872987 1.2335297 -1.8208284
##
## Dependent (x & y) model rate matrix:
## a|c a|d b|c b|d
## a|c -3.7711178 3.7711178 1.191744e-16 0.0000000
## a|d 0.5702951 -1.2146389 0.000000e+00 0.6443438
## b|c 0.2837940 0.0000000 -3.256106e+00 2.9723119
## b|d 0.0000000 0.7770299 1.739702e+00 -2.5167321
##
## Model fit:
## log-likelihood AIC
## independent -53.27732 114.5546
## dependent -51.99965 119.9993
##
## Hypothesis test result:
## likelihood-ratio: 2.555352
## p-value: 0.6347523
##
## Model fitting method used was fitDiscrete
```

Note that although the parameter estimates & likelihoods should be quite
similar, they won't be precisely the same because `fitDiscrete`

makes
different
default root node prior. The likelihoods should also not be compared.

Finally, we can try the `"ER"`

model as follows, noting the
caveat that I outline above:

```
fit.ER<-fitPagel(tree,x,y,model="ER")
fit.ER
```

```
##
## Pagel's binary character correlation test:
##
## Assumes "ER" substitution model for both characters
##
## Independent model rate matrix:
## a|c a|d b|c b|d
## a|c -1.5175113 0.9151581 0.6023532 0.0000000
## a|d 0.9151581 -1.5175113 0.0000000 0.6023532
## b|c 0.6023532 0.0000000 -1.5175113 0.9151581
## b|d 0.0000000 0.6023532 0.9151581 -1.5175113
##
## Dependent (x & y) model rate matrix:
## a|c a|d b|c b|d
## a|c -0.4990655 0.3757513 0.1233143 0.0000000
## a|d 0.3757513 -1.2105813 0.0000000 0.8348301
## b|c 0.1233143 0.0000000 -1.4990703 1.3757560
## b|d 0.0000000 0.8348301 1.3757560 -2.2105861
##
## Model fit:
## log-likelihood AIC
## independent -54.94916 113.8983
## dependent -53.66275 115.3255
##
## Hypothesis test result:
## likelihood-ratio: 2.572823
## p-value: 0.2762603
##
## Model fitting method used was fitMk
```

Unfortunately, to run `dotTree`

as I have above you may have to
roll-back the internally used package, plotrix. For some reason I've yet to
figure out, the latest plotrix version breaks `dotTree`

(and in
quite a spectacular way). Hopefully I will get to the bottom of it soon.

```
## here is how to install an archived version of plotrix
install.packages(
"https://cran.r-project.org/src/contrib/Archive/plotrix/plotrix_3.6-1.tar.gz",
type="source")
```

The data for this exercise were simulated *without* a correlation,
as follows:

```
tree<-pbtree(n=60,scale=1)
Q<-matrix(c(-1,1,1,-1),2,2)
rownames(Q)<-colnames(Q)<-letters[1:2]
x<-sim.history(tree,Q)$states
rownames(Q)<-colnames(Q)<-letters[3:4]
y<-sim.history(tree,Q)$states
```

Hi Liam, thanks for this update.

ReplyDeleteI followed your instruction and installed plotrix 3.6-1. However, no dots of traits were shown in the plot (the tree was plotted correctly though) when I used dotTree. I did get a few warning messages:

In cos(angles) * radius[circle] + x :

longer object length is not a multiple of shorter object length

I would appreciate a fix to this issue.

Thanks,

Cheng-Chiang Wu

Hope you can do a version of:

ReplyDeletePagel, M., and A. Meade. 2006. Bayesian analysis of correlated evolution of discrete characters by reversible‐jump Markov chain Monte Carlo. The American Naturalist 167: 808–825.

I wonder if stochastic mapping could be adapted to this?

I also got the error that CW got.

ReplyDeleteDoes it handle missing data?

ReplyDelete