Over the past few days I’ve written various posts about the pruning algorithm and marginal ancestral state reconstruction for discrete characters (1, 2, 3). I implemented this in a generic method (called ancr
) that computes marginal states based on whatever fitted model object we input!
One of the more powerful things that we might imagine doing with a method like this, however, is model-averaging – so I just added that functionality to the method. I did a similar thing for stochastic mapping last month in which different input models were sampled in proportion to their model weights. In this case, however, we can just model-weight our ancestral state estimates.
Let’s try an example based on pair vs. group spawning modes in bony fishes using a phylogeny & dataset from Benun Sutton & Wilson (2019).
library(phytools)
packageVersion("phytools")
## [1] '1.6.2'
data(bonyfish.tree)
data(bonyfish.data)
spawning_mode<-setNames(bonyfish.data$spawning_mode,
rownames(bonyfish.data))
head(spawning_mode)
## Xenomystus_nigri Chirocentrus_dorab Talismania_bifurcata
## pair group group
## Alepocephalus_tenebrosus Misgurnus_bipartitus Opsariichthys_bidens
## group pair pair
## Levels: group pair
For a binary character like this one, there are really only four possible extended Mk evolutionary models: equal-rates, different-rates, and the two irreversible trait evolution models (pair → group spawning, and the reverse).
Let’s fit each of these.
er_spawning<-fitMk(bonyfish.tree,spawning_mode,
model="ER",pi="fitzjohn")
ard_spawning<-fitMk(bonyfish.tree,spawning_mode,
model="ARD",pi="fitzjohn")
irr1_spawning<-fitMk(bonyfish.tree,spawning_mode,
model=matrix(c(0,0,1,0),2,2),pi="fitzjohn")
irr2_spawning<-fitMk(bonyfish.tree,spawning_mode,
model=matrix(c(0,1,0,0),2,2),pi="fitzjohn")
We can easily plot each of our four fitted models – which is a good step to include, as a minimum to ensure that we fit our four models correctly!
par(mfrow=c(2,2))
mar<-c(0.1,0.1,2.1,0.1)
plot(er_spawning,spacer=0.25,
mar=mar,show.zeros=FALSE,signif=5,
offset=0.04)
mtext("a) ER model",adj=0.1)
plot(irr1_spawning,spacer=0.25,
mar=mar,show.zeros=FALSE,signif=5,
offset=0.04)
mtext("b) group to pair model",adj=0.1)
plot(irr2_spawning,spacer=0.25,
mar=mar,show.zeros=FALSE,signif=5,
offset=0.04)
mtext("c) pair to group model",adj=0.1)
plot(ard_spawning,spacer=0.25,
mar=mar,show.zeros=FALSE,signif=5,
offset=0.04)
mtext("d) ARD model",adj=0.1)
Great! Now let’s compare our fitted models using a generic anova
method – but (when we do) save it to a new object, aov.spawning
.
aov.spawning<-anova(er_spawning,irr1_spawning,
irr2_spawning,ard_spawning)
## log(L) d.f. AIC weight
## er_spawning -29.51819 1 61.03638 0.2955508
## irr1_spawning -30.04038 1 62.08076 0.1753271
## irr2_spawning -29.25299 1 60.50598 0.3853082
## ard_spawning -29.23851 2 62.47702 0.1438138
We can see that the highest weight of evidence falls on the pair → group spawning irreversible model, but the result is quite ambiguous – all models received substantial support.
We’re already ready to compute our model-averaged results as follows.
spawning_anc<-ancr(aov.spawning)
spawning_anc
## Marginal ancestral state estimates:
## group pair
## 91 0.192907 0.807093
## 92 0.197408 0.802592
## 93 0.205489 0.794511
## 94 0.205445 0.794555
## 95 0.978379 0.021621
## 96 0.058689 0.941311
## ...
##
## Log-likelihood = -30.563539
Finally, let’s graph this result on the tree.
sigmoidPhylogram(bonyfish.tree,offset=1,ftype="i",fsize=0.7,
direction="upwards")
cols_spawning<-setNames(viridisLite::viridis(n=2),
levels(spawning_mode))
nodelabels(pie=spawning_anc$ace,piecol=cols_spawning,cex=0.5)
tiplabels(pie=to.matrix(spawning_mode[bonyfish.tree$tip.label],
levels(spawning_mode)),piecol=cols_spawning,cex=0.25)
legend(x=0.95*Ntip(bonyfish.tree),
y=0.2*max(nodeHeights(bonyfish.tree)),
names(cols_spawning),pch=21,pt.bg=cols_spawning,pt.cex=2,
cex=0.9,bty="n")
Our figure shows the model-averaged marginal states – integrating across each of our four Mk models in proportion to the weight of evidence in support of each model. Cool!
Now let’s repeat this whole workflow again – but for a three-state character.
In this case, the number of possible models expands considerably – however, I’ll fit only the ER, SYM (symmetric), and ARD models. (This shouldn’t be interpreted as an endorsement of these specific models – I’ve just done this for the purposes of convenience!)
The data I’m using in this example are diel activity pattern in primates from Kirk & Kay (2004).
## load data
data(primate.tree)
data(primate.data)
diel_pat<-setNames(as.factor(primate.data$Activity_pattern),
rownames(primate.data))
## fit models
er_diel<-fitMk(primate.tree,diel_pat,model="ER",
pi="fitzjohn")
sym_diel<-fitMk(primate.tree,diel_pat,model="SYM",
pi="fitzjohn")
ard_diel<-fitMk(primate.tree,diel_pat,model="ARD",
pi="fitzjohn")
## compare models
aov.diel<-anova(er_diel,sym_diel,ard_diel)
## log(L) d.f. AIC weight
## er_diel -33.42659 1 68.85318 0.47738258
## sym_diel -31.43569 3 68.87137 0.47305890
## ard_diel -30.69175 6 73.38350 0.04955853
Having fit and compared our (in this case) three models, we can go ahead and compute model-weighted marginal ancestral states using ancr
as follows.
diel_anc<-ancr(aov.diel)
diel_anc
## Marginal ancestral state estimates:
## Cathemeral Diurnal Nocturnal
## 91 0.000036 0.018498 0.981466
## 92 0.001103 0.047293 0.951604
## 93 0.002066 0.932075 0.065859
## 94 0.000139 0.998362 0.001499
## 95 0.000020 0.999935 0.000045
## 96 0.000008 0.999983 0.000008
## ...
##
## Log-likelihood = -33.142554
Lastly, let’s plot our results.
## plot results
layout(matrix(c(1,1,1,2,3,4),3,2),
widths=c(0.5,0.5))
plotTree(primate.tree,fsize=0.6,ftype="i",
lwd=1,mar=c(2.1,0.1,2.1,0.1))
cols_diel<-setNames(c("#2297E6","#DAA520","black"),
levels(diel_pat))
legend(x=0.1*max(nodeHeights(primate.tree)),
y=0.1*Ntip(primate.tree),names(cols_diel),pch=21,
pt.bg=cols_diel,bty="n",cex=0.9,pt.cex=2)
nodelabels(pie=diel_anc$ace,cex=0.7,piecol=cols_diel)
mtext("a) model-weighted marginal reconstructions",
adj=0.1,cex=0.9)
par(srt=40)
plot(er_diel,spacer=0.25,color=TRUE,
mar=c(0.1,0.1,2.1,0.1),text=FALSE,asp=0.8,
xlim=c(-1,1),ylim=c(-1,1.2))
mtext(paste("b) ER model (Akaike weight = ",
round(aov.diel$weight[1],2),")"),adj=0.1,cex=0.9)
plot(sym_diel,spacer=0.25,color=TRUE,
mar=c(0.1,0.1,2.1,0.1),text=FALSE,asp=0.8,
xlim=c(-1,1),ylim=c(-1,1.2))
mtext(paste("c) SYM model (Akaike weight = ",
round(aov.diel$weight[2],2),")"),adj=0.1,cex=0.9)
plot(ard_diel,spacer=0.25,color=TRUE,
mar=c(0.1,0.1,2.1,0.1),text=FALSE,asp=0.8,
xlim=c(-1,1),ylim=c(-1,1.2))
mtext(paste("d) ARD model (Akaike weight = ",
round(aov.diel$weight[3],2),")"),adj=0.1,cex=0.9)
par(srt=0)
That’s all there is to it.
No comments:
Post a Comment
Note: due to the very large amount of spam, all comments are now automatically submitted for moderation.