Tuesday, December 13, 2022

Fitting and comparing alternate flavors of Pagel's (1994) correlated binary character evolution model using fitPagel

In an effort to ensure that my recent book with Luke Harmon is as useful to the R phylogenetics community as possible, I have been trying to publish both corrections and supplements, the latter as obvious extensions or adaptations of the methodologies we present in the book arise – often as a result of reader feedback.

Today, I published a small supplement to Chapter 7 showing how to fit and compare alternative flavors of the Pagel (1994) correlated binary character evolution model using fitPagel of the phytools package. I have also posted this supplement below. All supplements are linked from the book website. Readers, please continue to submit your questions, corrections, & comments. Thank you!

In Chapter 7 of the book, Luke and I cover the important correlated binary character evolution model of Pagel (1994). Under this method, we test a hypothesis in which the rate or rates of character evolution (say) of character x can depend on the state of y and/or the rate or rates of character evolution for character y depend on the state of x.

To illustrate the method, we use a published dataset of spawning mode (‘pair’ vs. ‘group’) and male parental care in fishes (from Benun Sutton and Wilson 2019), in which we imagine that parental care should be more likely to evolve with pair spawning (in which paternity is assured) than group spawning.

Let’s take a look at these data, just as we did in the book.

	pt.cex=1.5,bty="n",title="spawning mode")
	bty="n",title="paternal care")
Figure 7.1 from the book, phylogeny for ninety species of bony fishes with the states for two different discrete characters: spawning mode (pair spawning vs. group spawning, in column 1) and paternal care (present or none, in column 2).

In Chapter 7, we focused merely on comparing two alternative hypothesis for the evolution of our two discrete traits on the tree. One (simpler) hypothesis, our null, in which the two characters’ evolution was independent (one trait from the other); and a second (alternative) hypothesis in which spawning mode both depended on paternal care, and vice versa.

We can actually fit a total of four different models: our two aforementioned, plus a model wherein the rates of spawning mode evolution dependent on the state of paternal care (but not the converse), and a model wherein paternal care depends on spawning mode (but not the converse).

In phytools::fitPagel this is done using the argument dep.var which can be "xy" (i.e., interdependence, the default), "x" or "y". The "xy" interdependent model is the most parameter rich and includes the others as special cases, while both "x" and "y" have the independent model as a special case.

fitPagel takes two character or factor vectors (or a matrix of probabilities) as input, so let’s start by extracting our two different characters, spawning mode and paternal care, as factors.


Now let’s proceed to fit each of the four models as follows.

## Warning in log(comp[1:M + N]): NaNs produced
## Warning in log(comp[1:M + N]): NaNs produced
## Warning in log(comp[1:M + N]): NaNs produced
##                       log(L) d.f.      AIC     weight
## independent        -62.00739    4 132.0148 0.02259014
## dependent_care     -57.19616    6 126.3923 0.37568166
## dependent_spawning -57.48214    6 126.9643 0.28224060
## interdependent     -55.35818    8 126.7164 0.31948760

When we compare each of the four fitted models using an anova method, as we’ve done here, we see that (actually) a model in which the rate of parental care evolution depends on spawning mode (but not the converse) is best-supported, although only barely, and the weight of evidence is fairly evenly distributed across each of the three non-independent models.

Let’s now graph our best-supported model, dependent_care.

Fitted correlated binary character evolution model in which the rates of change for trait 1 (parental care) were allowed to depend on the state of trait 2 (spawning mode), but not the converse.

We might observe that in this fitted model the rate of transition from male parental care and group spawning to no parental care and group spawning is very high – on the order of a million times higher than the rate of transition between other combinations of states. Does this mean anything in particular? Probably not, except that it likely suggests that if a lineage with male parental care were to evolve group spawning, our data and fitted model suggest that parental care would be lost almost instantly – which kind of makes sense and is also consistent with the trait combination of male parental care and group spawning being totally unobserved in our phylogenetic data!

Lastly, if we visualize our interdependent model (thus reiterating Figure 7.2 of the book), we no longer see a vastly inflated transition rate away from the male parental care, group spawning trait combination – probably because we didn’t constrain the rates of change between group and pair spawning to be independent of the state for parental care!

Reiteration of Figure 7.2, showing our fitted binary character evolution in which the rate of change for each of the two characters has been permitted to depend on the state of the other.

No comments:

Post a Comment

Note: due to the very large amount of spam, all comments are now automatically submitted for moderation.