I just updated fitPagel
, a phytools function that fit the Pagel
(1994) method for testing whether theh evolutiokn of one binary character
affects a second character (or vice versa).
The way this method works is it simply re-codes the two binary characters
into a single four state character - e.g., 0|0
, 0|1
,
etc. Then it fits two different models. In the first model, the 'independent'
model, the rate of transition in the first character is constrained to be
equal for a given transition type, irregardless of the state for the second
trait. So, for instance, the rate of transition 0|0 -> 1|0
is
the same as 0|1 -> 1|1
because both transitions involve the
same change (0 -> 1
) in character 1. The second model, the
'dependent' model, allows each type of transition to have a different rate.
In other words 0|0 -> 1|0
and 0|1 -> 1|1
can have
different rates (different probabilities of occuring on a given time
interval), even though the exact same type of change is occuring for the
same character in both instances. Obviously, transitions
0|0 -> 1|1
are constrained to have a rate of zero in both
models.
The
update
now permits the phytools function fitMk
to be used internally
to fit the Mk model. Previously the two methods available were
method = "ace"
and method = "fitDiscrete"
, and
these are both still available as options.
Let's compare them:
library(devtools)
install_github("liamrevell/phytools",quiet=TRUE)
library(phytools)
## Loading required package: ape
## Loading required package: maps
We can start by simulating data under the 'dependent' model
## simulate tree
tree<-pbtree(n=300,scale=1)
## simulate data for the two characters
Q<-matrix(c(0,0.4,0.4,0,2,0,0,2,2,0,0,2,0,0.4,0.4,0),4,4,byrow=TRUE)
rownames(Q)<-colnames(Q)<-c("aa","ab","ba","bb")
diag(Q)<--rowSums(Q)
tt<-sim.history(tree,Q)
## Note - the rate of substitution from i->j should be given by Q[j,i].
## Detecting that rows, not columns, of Q sum to zero :
## Transposing Q for internal calculations.
## Done simulation(s).
## split the 'aa','ab','bb','ba' data into two characters
t1<-mergeMappedStates(tt,c("aa","ab"),"a")
t1<-mergeMappedStates(t1,c("ba","bb"),"b")
t2<-mergeMappedStates(tt,c("aa","ba"),"a")
t2<-mergeMappedStates(t2,c("ab","bb"),"b")
t1$states<-getStates(t1,"tips")
t2$states<-getStates(t2,"tips")
## visualize the correlated evolution of the two traits
par(mfrow=c(1,2))
plotSimmap(t1,setNames(c("red","blue"),letters[1:2]),lwd=1,ftype="off")
plotSimmap(t2,setNames(c("red","blue"),letters[1:2]),lwd=1,ftype="off",
direction="leftwards")
## extract the tip data from our simulation
x<-getStates(t1,"tips")
y<-getStates(t2,"tips")
Now we're ready to fit our model using all three methods:
fitPagel(tree,x,y) ## default method is "fitMk"
##
## Pagel's binary character correlation test:
##
## Indepedent model rate matrix:
## a|a a|b b|a b|b
## a|a -0.8726028 0.6514011 0.2212017 0.0000000
## a|b 0.3220958 -0.5432975 0.0000000 0.2212017
## b|a 0.6189176 0.0000000 -1.2703187 0.6514011
## b|b 0.0000000 0.6189176 0.3220958 -0.9410134
##
## Dependent model rate matrix:
## a|a a|b b|a b|b
## a|a -0.5951317 0.3393103 0.2558215 0.0000000
## a|b 4.6486390 -4.7698708 0.0000000 0.1212318
## b|a 2.2086339 0.0000000 -4.5275279 2.3188940
## b|b 0.0000000 0.4113254 0.3105573 -0.7218827
##
## Model fit:
## log-likelihood
## independent -172.8745
## dependent -150.4001
##
## Hypothesis test result:
## likelihood-ratio: 44.94869
## p-value: 4.074835e-09
##
## Model fitting method used was fitMk
fitPagel(tree,x,y,method="ace")
##
## Pagel's binary character correlation test:
##
## Indepedent model rate matrix:
## a|a a|b b|a b|b
## a|a -0.8725883 0.6513844 0.2212039 0.0000000
## a|b 0.3220944 -0.5432984 0.0000000 0.2212039
## b|a 0.6189141 0.0000000 -1.2702985 0.6513844
## b|b 0.0000000 0.6189141 0.3220944 -0.9410086
##
## Dependent model rate matrix:
## a|a a|b b|a b|b
## a|a -0.595134 0.3393112 0.2558228 0.0000000
## a|b 4.648654 -4.7698873 0.0000000 0.1212328
## b|a 2.208634 0.0000000 -4.5275253 2.3188912
## b|b 0.000000 0.4113245 0.3105577 -0.7218823
##
## Model fit:
## log-likelihood
## independent -171.4882
## dependent -149.0138
##
## Hypothesis test result:
## likelihood-ratio: 44.94869
## p-value: 4.074835e-09
##
## Model fitting method used was ace
fitPagel(tree,x,y,method="fitDiscrete")
## Loading required package: geiger
## Loading required package: parallel
## Warning in fitDiscrete(tree, xy, model = iQ): Parameter estimates appear at bounds:
## q14
## q23
## q32
## q41
## Warning in fitDiscrete(tree, xy, model = dQ): Parameter estimates appear at bounds:
## q14
## q23
## q32
## q41
##
## Pagel's binary character correlation test:
##
## Indepedent model rate matrix:
## a|a a|b b|a b|b
## a|a -0.8815200 0.6639046 0.2176153 0.0000000
## a|b 0.3054217 -0.5230370 0.0000000 0.2176153
## b|a 0.6199298 0.0000000 -1.2838344 0.6639046
## b|b 0.0000000 0.6199298 0.3054217 -0.9253515
##
## Dependent model rate matrix:
## a|a a|b b|a b|b
## a|a -0.5943124 0.3350400 0.2592725 0.00000000
## a|b 4.0225133 -4.0854456 0.0000000 0.06293222
## b|a 2.0872453 0.0000000 -4.5241365 2.43689116
## b|b 0.0000000 0.3640895 0.2807990 -0.64488849
##
## Model fit:
## log-likelihood
## independent -171.5709
## dependent -149.5117
##
## Hypothesis test result:
## likelihood-ratio: 44.11832
## p-value: 6.06277e-09
##
## Model fitting method used was fitDiscrete
That's it.
Dear Liam,
ReplyDeleteI try to use fitPagel to check for correlation between two binary traits. However, I always get the following message when I'm using my data with fitMk method:
"model does not have the right number of columns"
Could you explain to me what this message exactly means?
Best
Marcin
Hi Marcin.
DeleteThis is a known bug in the CRAN version. I think it should be fixed if you update phytools from GitHub. To do that start a fresh R session & run:
library(devtools) ## must have devtools installed
install_github("liamrevell/phytools")
Let us know if that takes care of the issue.
- Liam
Thanks a lot. Now everything works perfectly.
DeleteBest
Marcin
Dear Liam,
ReplyDeleteIs it possible to select a substitution model for one variable and other model for the other? (say ER for X and ARD for Y)
Cheers,
Guilherme
Dear Liam,
ReplyDeleteI would like to use "fitPagel" to check for correlation between two binary traits. However, I always get the following message "could not find function "fitPagel"" even though I have the latest R version (3.4.0) and the libraries "phytools", "ape", "geiger", "rgl" installed and loaded.Do you have an idea why I get the error message?
Best
Stefan
I'm not sure why this would be the case. Have you tried restarting your R session?
DeleteDear Liam,
Deletethanks for your fast answer. Surprisingly, I was able to conduct PagelTests if I download phytools from the Danish server but not from German (Göttingen, Münster) servers.
Do you know a possibility to conduct a test with two factorial character of which one is not binary?
Best
Stefan