As I commented in two earlier posts (1, 2), yesterday a new version of phytools (version 0.7-47) became available on CRAN.
Since I've been doing lots of work on phytools, but haven't been keeping up to speed with that work on this
blog, I thought it might be useful to hammer out a few posts outlining some of the updated features of this
new phytools version compared to the previous CRAN release. So far I have commented on updates to
fit.bd
(for fitting a
birth-death model to a reconstructed tree), and in
threshBayes
(for estimating
the evolutionary correlation between traits under a threshold model).
One kind of last-minute update that I made in collaboration with Luke Harmom
via his geiger package (which also has a new CRAN version today)
was the introduction of a new object class ("Qmatrix"
), object conversion method (as.Qmatrix
), and S3
print
and plot
methods. as.Qmatrix
takes a fitted Mk discrete character evolution model (e.g., from
fitMk
or geiger::fitDiscrete
) and then converts it to an object of class "Qmatrix"
which is really just
(you guessed it) a matrix with the class attributed "Qmatrix"
. I also wrote a generic plot
method for the
object class, which now lies inside of the previous plot.fitMk
and plot.gfit
(for geiger) plotting
methods.
Here's a quick demo using the Anolis ecomorph phylogeny & data.
packageVersion("phytools")
## [1] '0.7.47'
packageVersion("geiger")
## [1] '2.0.7'
library(phytools)
library(geiger)
data(anoletree)
anoletree
##
## Phylogenetic tree with 82 tips and 81 internal nodes.
##
## Tip labels:
## ahli, allogus, rubribarbus, imias, sagrei, bremeri, ...
##
## The tree includes a mapped, 6-state discrete character with states:
## CG, GB, TC, TG, Tr, Tw
##
## Rooted; includes branch lengths.
Our tree is a "simmap"
object, but we actually just want a plain old "phylo"
tree, plus the tip states which
we will use as our discrete character.
ecomorph<-getStates(anoletree,"tips")
ecomorph
## ahli allogus rubribarbus imias sagrei
## "TG" "TG" "TG" "TG" "TG"
## bremeri quadriocellifer ophiolepis mestrei jubar
## "TG" "TG" "GB" "TG" "TG"
## homolechis confusus guafe garmani opalinus
## "TG" "TG" "TG" "CG" "TC"
## grahami valencienni lineatopus evermanni stratulus
## "TC" "Tw" "TG" "TC" "TC"
## krugi pulchellus gundlachi poncensis cooki
## "GB" "GB" "TG" "GB" "TG"
## cristatellus brevirostris caudalis marron websteri
## "TG" "Tr" "Tr" "Tr" "Tr"
## distichus alumina semilineatus olssoni insolitus
## "Tr" "GB" "GB" "GB" "Tw"
## whitemani haetianus breslini armouri cybotes
## "TG" "TG" "TG" "TG" "TG"
## shrevei longitibialis strahmi marcanoi baleatus
## "TG" "TG" "TG" "TG" "CG"
## barahonae ricordii cuvieri altitudinalis oporinus
## "CG" "CG" "CG" "TC" "TC"
## isolepis allisoni porcatus loysiana guazuma
## "TC" "TC" "TC" "Tr" "Tw"
## placidus sheplani alayoni angusticeps paternus
## "Tw" "Tw" "Tw" "Tw" "Tw"
## alutaceus inexpectatus clivicola cupeyalensis cyanopleurus
## "GB" "GB" "GB" "GB" "GB"
## alfaroi macilentus vanidicus baracoae noblei
## "GB" "GB" "GB" "CG" "CG"
## smallwoodi luteogularis equestris bahorucoensis dolichocephalus
## "CG" "CG" "CG" "GB" "GB"
## hendersoni darlingtoni aliniger singularis chlorocyanus
## "GB" "Tw" "TC" "TC" "TC"
## coelestinus occultus
## "TC" "Tw"
anoletree<-as.phylo(anoletree)
anoletree
##
## Phylogenetic tree with 82 tips and 81 internal nodes.
##
## Tip labels:
## ahli, allogus, rubribarbus, imias, sagrei, bremeri, ...
##
## Rooted; includes branch lengths.
plotTree(anoletree,offset=3,ftype="i",
fsize=0.8,lwd=1,type="fan")
cols<-setNames(c("green","#E4D96F","darkgreen",
"brown","black","darkgrey"),
c("CG","GB","TC","TG","Tr","Tw"))
tiplabels(pie=to.matrix(ecomorph,
c("CG","GB","TC","TG","Tr","Tw")),
piecol=cols,cex=0.5)
legend("topleft",
c("CG","GB","TC","TG","Tr","Tw"),
pch=21,pt.cex=2,pt.bg=cols)
Now we're pretty much ready to fit our model, which I will do with fitMk
.
fitARD<-fitMk(anoletree,ecomorph,model="ARD")
print(fitARD,digits=4)
## Object of class "fitMk".
##
## Fitted (or set) value of Q:
## CG GB TC TG Tr Tw
## CG 0.0000 0.0000 0.0000 0.0000 0.0000 0.000
## GB 0.0000 0.0000 0.0000 0.0000 0.0000 0.000
## TC 0.0460 0.0000 -0.0768 0.0000 0.0308 0.000
## TG 0.0000 0.0535 0.0000 -0.0535 0.0000 0.000
## Tr 0.0000 0.0000 0.0000 0.0000 0.0000 0.000
## Tw 0.0303 0.0576 0.0876 0.0820 0.0195 -0.277
##
## Fitted (or set) value of pi:
## CG GB TC TG Tr Tw
## 0.1666667 0.1666667 0.1666667 0.1666667 0.1666667 0.1666667
##
## Log-likelihood: -67.3937
##
## Optimization method used was "nlminb"
Now, to pull out the Q matrix let's use as.Qmatrix
:
as.Qmatrix(fitARD)
## Estimated Q matrix:
## CG GB TC TG Tr Tw
## CG 0.00000000 0.00000000 0.00000000 0.00000000 0.00000000 0.000000
## GB 0.00000000 0.00000000 0.00000000 0.00000000 0.00000000 0.000000
## TC 0.04603746 0.00000000 -0.07682257 0.00000000 0.03078511 0.000000
## TG 0.00000000 0.05354003 0.00000000 -0.05354003 0.00000000 0.000000
## Tr 0.00000000 0.00000000 0.00000000 0.00000000 0.00000000 0.000000
## Tw 0.03029880 0.05761600 0.08755621 0.08202421 0.01953679 -0.277032
Finally, let's go ahead & plot this fitted model:
plot(as.Qmatrix(fitARD),show.zeros=FALSE)
title(main="fitMk ARD model\nfor Anolis ecomorph data",
font.main=3,line=-0.5)
Finally, here's a fix that I just thought of & is not in the CRAN phytools version (but may be in a future edition*). [*Note: this update is already on GitHub.]
xy<-plot(as.Qmatrix(fitARD),show.zeros=FALSE)
library(plotrix)
nulo<-mapply(draw.circle,xy$x,xy$y,col=cols,
MoreArgs=list(radius=0.08))
text(xy$x,xy$y,xy$states,cex=1.2,col="white")
legend("topleft",xy$states,
pch=21,pt.cex=2.5,pt.bg=cols)
title(main="fitMk ARD model\nfor Anolis ecomorph data",
font.main=3,line=-0.5)
Ooh. That's cool.
We can do the same with geiger::fitDiscrete
- although we need to keep in mind that fitDiscrete
doesn't fit the same model by default,
since it makes a different assumption about the root
state.
fitARD.geiger<-fitDiscrete(anoletree,ecomorph,
model="ARD",control=list(niter=200))
print(fitARD.geiger,digits=4)
## GEIGER-fitted comparative model of discrete data
## fitted Q matrix:
## CG GB TC TG Tr Tw
## CG -1.918e-15 3.385e-22 1.211e-15 6.072e-16 9.914e-17 1.383e-20
## GB 1.686e-88 -3.008e-16 1.832e-17 2.822e-16 3.483e-19 3.704e-35
## TC 4.603e-02 3.826e-17 -7.682e-02 2.376e-16 3.079e-02 3.992e-17
## TG 2.680e-29 5.354e-02 2.017e-23 -5.354e-02 3.830e-17 3.492e-17
## Tr 5.471e-16 1.104e-31 3.751e-17 4.986e-16 -1.086e-15 2.906e-18
## Tw 3.030e-02 5.762e-02 8.756e-02 8.202e-02 1.954e-02 -2.770e-01
##
## model summary:
## log-likelihood = -65.601973
## AIC = 191.203945
## AICc = 227.674534
## free parameters = 30
##
## Convergence diagnostics:
## optimization iterations = 200
## failed iterations = 0
## number of iterations with same best fit = 4
## frequency of best fit = 0.02
##
## object summary:
## 'lik' -- likelihood function
## 'bnd' -- bounds for likelihood search
## 'res' -- optimization iteration summary
## 'opt' -- maximum likelihood parameter estimates
as.Qmatrix(fitARD.geiger)
## Estimated Q matrix:
## CG GB TC TG Tr Tw
## CG -1.917627e-15 3.385247e-22 1.211323e-15 6.071542e-16 9.913558e-17 1.382745e-20
## GB 1.686377e-88 -3.008306e-16 1.831573e-17 2.821666e-16 3.482602e-19 3.703619e-35
## TC 4.603417e-02 3.826475e-17 -7.682342e-02 2.376160e-16 3.078925e-02 3.992021e-17
## TG 2.680183e-29 5.354024e-02 2.016519e-23 -5.354024e-02 3.829973e-17 3.492387e-17
## Tr 5.470938e-16 1.104037e-31 3.750727e-17 4.985806e-16 -1.086088e-15 2.906186e-18
## Tw 3.029966e-02 5.761593e-02 8.755644e-02 8.202408e-02 1.953551e-02 -2.770316e-01
xy<-plot(as.Qmatrix(fitARD.geiger),
show.zeros=FALSE)
nulo<-mapply(draw.circle,xy$x,xy$y,col=cols,
MoreArgs=list(radius=0.08))
text(xy$x,xy$y,xy$states,cex=1.2,col="white")
legend("topleft",xy$states,
pch=21,pt.cex=2.5,pt.bg=cols)
title(main=paste("fitDiscrete ARD model\n",
"for Anolis ecomorph data"),
font.main=3,line=-0.5)
Even though the two models make different assumptions, their estimates (in this particular case) are almost exactly the same. (We need to be careful, though, because this is not always going to be the true!) We can see that by plotting one against the other:
design<-(lower.tri(as.Qmatrix(fitARD))+
upper.tri(as.Qmatrix(fitARD)))>0 ## this is a bit hacky
plot.default(as.Qmatrix(fitARD)[design],
as.Qmatrix(fitARD.geiger)[design],bty="l",pch=19,
col=make.transparent("blue",0.5),cex=1.5,
asp=1,xlab="fitMk estimates",
ylab="fitDiscrete estimates")
lines(par()$usr[1:2],par()$usr[1:2],lty="dotted")
Cool.
No comments:
Post a Comment
Note: due to the very large amount of spam, all comments are now automatically submitted for moderation.