In an effort to ensure that my recent book with Luke Harmon is as useful to the R phylogenetics community as possible, I have been trying to publish both corrections and supplements, the latter as obvious extensions or adaptations of the methodologies we present in the book arise – often as a result of reader feedback.
Today, I published a small supplement to Chapter 7
showing how to fit and compare alternative flavors of the Pagel
(1994) correlated binary character evolution model using
fitPagel
of the phytools package. I have also posted this supplement below. All supplements
are linked from the book website. Readers, please continue to
submit your questions, corrections, & comments. Thank you!
In Chapter 7 of the book, Luke and I cover the important correlated binary character evolution model of Pagel (1994). Under this method, we test a hypothesis in which the rate or rates of character evolution (say) of character x can depend on the state of y and/or the rate or rates of character evolution for character y depend on the state of x.
To illustrate the method, we use a published dataset of spawning mode (‘pair’ vs. ‘group’) and male parental care in fishes (from Benun Sutton and Wilson 2019), in which we imagine that parental care should be more likely to evolve with pair spawning (in which paternity is assured) than group spawning.
Let’s take a look at these data, just as we did in the book.
library(phytools)
bonyfish.tree<-read.tree(file="http://www.phytools.org/Rbook/7/bonyfish.tre")
bonyfish.data<-read.csv(file="http://www.phytools.org/Rbook/7/bonyfish.csv",
row.names=1,stringsAsFactors=TRUE)
object<-plotTree.datamatrix(bonyfish.tree,bonyfish.data,
yexp=1,header=FALSE,xexp=1.45,
palettes=c("YlOrRd","PuBuGn"))
leg<-legend(x="topright",names(object$colors$spawning_mode),
cex=0.7,pch=22,pt.bg=object$colors$spawning_mode,
pt.cex=1.5,bty="n",title="spawning mode")
leg<-legend(x=leg$rect$left+4.7,y=leg$rect$top-leg$rect$h,
names(object$colors$paternal_care),cex=0.7,
pch=22,pt.bg=object$colors$paternal_care,pt.cex=1.5,
bty="n",title="paternal care")
In Chapter 7, we focused merely on comparing two alternative hypothesis for the evolution of our two discrete traits on the tree. One (simpler) hypothesis, our null, in which the two characters’ evolution was independent (one trait from the other); and a second (alternative) hypothesis in which spawning mode both depended on paternal care, and vice versa.
We can actually fit a total of four different models: our two aforementioned, plus a model wherein the rates of spawning mode evolution dependent on the state of paternal care (but not the converse), and a model wherein paternal care depends on spawning mode (but not the converse).
In phytools::fitPagel
this is done using the argument dep.var
which can be "xy"
(i.e.,
interdependence, the default), "x"
or "y"
. The "xy"
interdependent model is the most parameter
rich and includes the others as special cases, while both "x"
and "y"
have the independent model
as a special case.
fitPagel
takes two character or factor vectors (or a matrix of probabilities) as input, so let’s
start by extracting our two different characters, spawning mode and paternal care, as factors.
spawning_mode<-setNames(bonyfish.data[,1],
rownames(bonyfish.data))
paternal_care<-setNames(bonyfish.data[,2],
rownames(bonyfish.data))
Now let’s proceed to fit each of the four models as follows.
interdependent<-fitPagel(bonyfish.tree,paternal_care,
spawning_mode)
## Warning in log(comp[1:M + N]): NaNs produced
dependent_care<-fitPagel(bonyfish.tree,paternal_care,
spawning_mode,dep.var="x")
## Warning in log(comp[1:M + N]): NaNs produced
dependent_spawning<-fitPagel(bonyfish.tree,paternal_care,
spawning_mode,dep.var="y")
## Warning in log(comp[1:M + N]): NaNs produced
anova(dependent_care,dependent_spawning,interdependent)
## log(L) d.f. AIC weight
## independent -62.00739 4 132.0148 0.02259014
## dependent_care -57.19616 6 126.3923 0.37568166
## dependent_spawning -57.48214 6 126.9643 0.28224060
## interdependent -55.35818 8 126.7164 0.31948760
When we compare each of the four fitted models using an anova
method, as we’ve done here, we see
that (actually) a model in which the rate of parental care evolution depends on spawning mode (but not
the converse) is best-supported, although only barely, and the weight of evidence is fairly evenly
distributed across each of the three non-independent models.
Let’s now graph our best-supported model, dependent_care
.
plot(dependent_care,cex.main=1,cex.sub=0.8,
cex.traits=0.7,cex.rates=0.7,
lwd.by.rate=TRUE,max.lwd=6)
We might observe that in this fitted model the rate of transition from male parental care and group spawning to no parental care and group spawning is very high – on the order of a million times higher than the rate of transition between other combinations of states. Does this mean anything in particular? Probably not, except that it likely suggests that if a lineage with male parental care were to evolve group spawning, our data and fitted model suggest that parental care would be lost almost instantly – which kind of makes sense and is also consistent with the trait combination of male parental care and group spawning being totally unobserved in our phylogenetic data!
Lastly, if we visualize our interdependent model (thus reiterating Figure 7.2 of the book), we no longer see a vastly inflated transition rate away from the male parental care, group spawning trait combination – probably because we didn’t constrain the rates of change between group and pair spawning to be independent of the state for parental care!
plot(interdependent,cex.main=1,cex.sub=0.8,
cex.traits=0.7,cex.rates=0.7,
lwd.by.rate=TRUE,max.lwd=6)
No comments:
Post a Comment
Note: due to the very large amount of spam, all comments are now automatically submitted for moderation.