An undergraduate in Daniel Cadena's
lab here at Uniandes, whom I have been helping with her senior thesis project,
recently pointed me in the direction of the package
MuMIn, which includes
in its arsenal the ominously named function
dredge.
dredge automates the fitting of all possible models for a given set of independent variables.
Unfortunately, though this function works with a variety of different fitted
model types, it does not work with the function pgls.SEy
(1,
2),
for PGLS with uncertainty in y, which I also wrote in support of this project.
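For context, a single model of this kind can be fit directly with a call along the following lines. This is a sketch only; it uses the tree, data frame X, and vector of standard errors SE that appear below, and mirrors the arguments passed later in this post:
## fit one PGLS model with sampling error in y
fit<-pgls.SEy(ye~x1+x2,data=X,tree=tree,se=SE,method="ML")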
Please note that this post does not constitute an endorsement of
any activity that could be classified as “data dredging” - and I warned
the student of the dangers of such behavior. I nonetheless
assembled a highly simplified version of dredge that she would
be able to apply to a model fit using pgls.SEy (or, in fact, to
any similarly structured model-fitting function in R, such as lm).
The function takes the input arguments y and x,
which are not the data themselves, but rather the name of the dependent
variable and a vector of the names of the independent variables, respectively.
For instance:
tree ## tree
##
## Phylogenetic tree with 100 tips and 99 internal nodes.
##
## Tip labels:
## t65, t66, t29, t30, t13, t78, ...
##
## Rooted; includes branch lengths.
X ## our data
## ye x1 x2 x3 x4
## t65 -0.90772465 -0.933177258 -0.50289849 -0.833637779 -1.226811847
## t66 -1.17179800 -0.360779073 -0.67343194 -1.232873565 -1.558929304
## t29 4.15357374 -0.004324650 -0.86464732 -1.522271412 0.092652937
## t30 3.55198867 -0.283674892 -0.05950573 -1.158279742 0.783271000
## t13 -1.33541082 -0.260222231 1.69823520 -0.958306299 0.927545322
## t78 -0.56207919 -0.030577008 -0.12238217 -0.581989880 -0.789061617
## t79 -1.70823856 -0.242023433 0.32537283 -0.744757917 -0.830916154
## t5 3.54366591 -0.351473922 -0.86081240 -1.800558381 -0.146559729
## t26 -3.34799011 -0.417929491 -0.87860040 -0.964327081 -1.780002063
## t67 -3.58452023 -0.494481915 -1.04212209 -1.041371912 -1.810841912
## t68 -3.32209074 -0.588127345 -0.71373910 -0.700753022 -1.315598157
## t86 -2.54198007 -0.045597241 -1.28701218 -0.870836065 -1.752840010
## t87 -2.39865344 -0.284624727 -1.19177925 -0.413622945 -1.571490363
## t25 -1.57609619 -0.099326903 -1.26492190 -0.185306209 -1.449849790
## t16 -5.12139002 -0.329391279 -0.69189543 -0.731627112 -2.294395131
## t47 -5.88290498 -0.998675854 -0.60213264 -1.864536626 -2.411244328
## t48 -5.58564307 -1.135659972 -0.90384083 -2.177917526 -2.317161349
## t10 5.93972192 -0.170786099 -0.84404497 0.258286333 0.254189627
## t20 0.11366308 -0.116176393 -1.43439903 -1.008153379 -0.756823697
## t63 3.94588996 -0.271788410 -0.48803237 -0.940323863 0.501302246
## t64 3.21911945 -0.122474607 -0.87832126 -0.254448825 0.274372750
## t53 4.74552509 -0.188851979 -0.48504631 -0.201398930 0.257906423
## t21 0.75183425 -0.666761834 -1.27216994 -0.963997566 -0.511842190
## t23 5.96488228 -0.263860837 -1.16042313 -1.542287147 0.514510167
## t49 4.17655995 -0.143231501 -1.19002853 -1.450748764 0.507057795
## t50 4.80439077 0.258295208 -0.60080140 -1.075878405 0.781512173
## t80 2.32980803 -0.326915114 0.10754248 -0.923439256 0.643239607
## t81 4.08965563 -0.507894799 0.26522058 -0.469615756 1.120269519
## t19 -1.48944997 -0.811681993 -0.26318414 -2.777328413 -0.603814640
## t84 -2.91205159 -0.737986864 -1.02763382 -1.957564180 -1.353937182
## t85 -2.06165298 -0.712068874 -0.77728970 -2.245292884 -1.401371778
## t40 -3.02205037 -0.933452671 -0.83696879 -2.547047370 -1.476694576
## t41 -4.87730189 -0.229120475 -1.49848945 -2.765038158 -2.032890033
## t18 -0.52165070 -0.204012732 -1.64088635 -2.344461535 -1.124866746
## t76 -1.65255423 -0.117917709 -0.77909630 -2.544663932 -1.095495017
## t77 0.07018189 0.287088622 -0.72342580 -2.680871209 -0.986772078
## t27 0.92637845 -0.358620911 -1.71171019 -1.988283823 -0.648392578
## t92 3.12710215 -0.337589308 -1.45006236 -2.639934958 0.017958145
## t93 3.65860534 -0.312388755 -1.60509872 -2.595797518 0.158134389
## t8 -2.36028047 -0.676186538 -0.41457179 -1.775466733 -0.527343477
## t2 5.66245601 -1.382336906 -0.60476399 -2.774459210 0.987429394
## t95 2.16726940 0.071824226 -0.04136211 -0.978944992 0.702904379
## t96 5.23800963 -0.000369423 0.04677993 -1.153812002 0.845994491
## t6 -3.08927960 -1.228130839 1.24440860 -1.666626516 -0.183465598
## t44 -0.32675842 -0.778840190 -0.10432506 -0.930466508 0.132912688
## t51 0.71141037 -1.529691824 -0.63523463 -1.351234148 -0.203513384
## t52 -2.97932839 -0.671563524 0.25110920 -0.940436026 -0.144505347
## t72 -0.56710045 -0.208175485 0.21888791 -0.683621798 -0.439724856
## t73 3.06125726 -0.280504466 -0.05425123 -0.977418875 -0.399349006
## t35 0.88640748 -0.432133697 -0.60337432 -1.217613428 -0.161792920
## t38 2.22799237 -1.599070451 -1.12646600 -0.825638529 0.019129341
## t39 0.80698662 -1.019397388 -0.14879313 -0.727519051 0.100882723
## t99 -0.56690640 0.819938089 0.10457809 0.343316393 -0.571194773
## t100 0.31251389 0.797765277 0.09518138 0.337884832 -0.567458375
## t24 -5.01149282 0.132350015 1.25951182 -0.438835754 -0.889606534
## t11 -1.46762699 -0.338010923 0.63895729 -0.717547221 0.274369423
## t36 -0.79179712 -0.109222940 -0.84517257 -0.001648664 -1.305286979
## t37 0.67623873 -0.783798640 0.25056370 -0.099499147 0.026504819
## t28 -3.14350681 -0.624356825 -0.18617947 -1.046835231 -1.207608991
## t7 1.79371239 -0.995914763 -0.82088749 0.114602663 0.108922281
## t12 3.06529461 -0.430696448 -0.37673980 -0.904791446 0.664551613
## t33 1.01319467 -0.459191936 0.14792529 -0.647299464 0.429768234
## t34 -1.83383112 -1.369205857 0.15734648 -1.133452917 -0.185282740
## t15 -5.28474388 -0.668597155 -0.19397849 -1.378048152 -1.394404802
## t42 0.91464728 -1.006046065 -0.59102696 -2.544077955 -0.510802841
## t43 0.17074896 -0.657881494 -0.76035426 -1.710494052 -0.432355897
## t22 -0.93089794 -0.549687736 -0.49300322 -2.316250811 -0.798517510
## t1 1.37181061 -0.735847227 -0.10188643 -1.482318015 -0.069082305
## t3 5.83721322 -1.255254006 -1.36243767 -0.910301622 0.679504522
## t4 2.77524322 0.036571658 0.12260330 -1.806011253 0.233933022
## t74 2.32812021 1.178218512 -0.76133067 -0.509128250 -0.997411930
## t75 2.74291448 1.064090251 -0.95706527 -0.457299846 -0.867581954
## t31 -2.28596276 -0.435236571 0.54331428 -0.959954296 -0.098098866
## t59 -4.96441837 -0.567740108 0.91285849 -0.720905297 -0.427382552
## t60 -5.65604410 -0.516435916 0.64042685 -1.284172323 -0.730787277
## t9 7.14934917 1.375046920 -1.24425619 0.495073740 0.695899208
## t90 7.20415861 0.878034624 -2.01772447 0.450435820 0.664090248
## t91 5.67348238 0.935005458 -2.09172115 0.296478696 0.295762185
## t56 0.67333579 -0.517446554 0.30596671 -0.270272009 0.447548743
## t57 -1.04853187 -0.116159889 0.05991413 0.112933000 0.284765732
## t14 -3.77856865 -0.906819465 0.46607917 -0.328972208 -0.508998125
## t61 -3.93022454 -1.312009985 1.32878564 -0.562407187 0.007788355
## t62 -5.05415778 -1.612880369 1.08221968 -0.620768376 -0.315525473
## t69 -2.95408172 -0.242584724 0.76102366 -0.318980789 -0.494282321
## t70 -4.90426049 -0.396631201 1.07417172 -0.504832758 -0.631509904
## t88 -0.79912057 0.237257899 0.41246936 -0.726583288 -0.117175979
## t89 1.19199363 0.434875538 0.30737619 -0.326529423 -0.204683678
## t54 -2.81800403 -0.569749931 1.07419352 0.071561418 -0.288446193
## t55 -1.19879402 -0.152194258 1.08008287 -0.016827783 0.387440401
## t32 -1.75551359 0.602680886 0.75156851 -0.389637248 -0.359414984
## t45 -1.11541051 -1.438964529 2.01395563 0.804516332 0.786960566
## t46 -0.24158566 -1.404376125 1.86329858 0.837682607 1.315479613
## t17 3.18356839 -0.217652443 0.32900077 1.144934996 1.249666779
## t58 -5.15046378 -1.498640558 1.17233451 0.384922560 -0.425217811
## t82 -3.68968208 -1.348265012 1.08151593 0.523997757 0.152053682
## t83 -4.32950798 -1.381394718 1.22167672 0.207916476 -0.146115788
## t71 -4.28308869 -1.170708866 1.02083131 0.157810886 -0.164189066
## t94 -4.87114380 -1.200085947 1.03456095 0.172964350 -0.358510350
## t97 -5.38291471 -1.213089703 1.18276260 0.103353295 -0.596856961
## t98 -5.25643062 -1.402582088 1.24727944 0.018381114 -0.525440524
SE ## our standard errors in y
## t65 t66 t29 t30 t13 t78 t79
## 0.2838481 1.1192735 0.5848418 1.5342279 1.4452484 1.9503550 0.8550472
## t5 t26 t67 t68 t86 t87 t25
## 0.4786313 0.2227377 0.4710501 0.4939850 0.5502124 0.8233990 0.3534902
## t16 t47 t48 t10 t20 t63 t64
## 0.4161823 0.7945718 0.7589094 1.7952263 1.0058047 1.1459235 0.3266516
## t53 t21 t23 t49 t50 t80 t81
## 1.2841140 0.6995356 1.2588015 0.4612205 0.1606824 1.6941166 1.0015120
## t19 t84 t85 t40 t41 t18 t76
## 1.0403561 0.3647199 1.2730340 0.9148257 1.3012779 0.2001514 0.7520138
## t77 t27 t92 t93 t8 t2 t95
## 1.2245456 0.1813785 0.5115684 1.1474370 0.9674241 1.6606047 0.8268985
## t96 t6 t44 t51 t52 t72 t73
## 1.8817409 1.4248163 0.8072912 0.1955722 1.8251670 0.4031782 1.3043864
## t35 t38 t39 t99 t100 t24 t11
## 1.2561020 0.8432288 0.7994611 0.8215323 1.5471417 0.9000843 0.5142108
## t36 t37 t28 t7 t12 t33 t34
## 0.7591903 0.5203935 0.1893714 1.0115815 0.9492312 0.5408957 0.5958648
## t15 t42 t43 t22 t1 t3 t4
## 1.2035645 0.9897115 0.8888153 2.5470790 0.3322280 0.8961755 0.5924993
## t74 t75 t31 t59 t60 t9 t90
## 0.7070748 0.5629469 0.5976459 0.5577206 1.4663451 1.4102296 0.9487273
## t91 t56 t57 t14 t61 t62 t69
## 0.6127336 0.9353615 1.1832985 0.9536276 0.5271702 1.1272450 1.2397424
## t70 t88 t89 t54 t55 t32 t45
## 0.9483944 0.4326344 0.9381514 0.7614915 0.8734860 0.3894159 0.6516323
## t46 t17 t58 t82 t83 t71 t94
## 0.7583656 1.5420088 0.2704494 1.0767461 0.1762237 1.0317629 1.1824717
## t97 t98
## 1.4317323 0.9749933
## our variable names
y<-"ye"
x<-c("x1","x2","x3","x4")
## the function
DREDGE<-function(y,x,fn=pgls.SEy,...){
    n<-length(x)
    ## enumerate every non-empty subset of the predictors
    ii<-list()
    for(i in 1:n) ii<-c(ii,combn(x,i,simplify=FALSE))
    fits<-list()
    ## fit the intercept-only model first
    fits[[1]]<-fn(as.formula(paste(y,"~1")),...)
    ## then fit one model per predictor subset
    for(i in 1:length(ii)){
        model<-as.formula(paste(y,"~",paste(ii[[i]],collapse="+")))
        fits[[i+1]]<-fn(model,...)
    }
    fits
}
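Before applying it, it may help to see exactly which models the enumeration step produces. The following minimal sketch reuses the subset logic from DREDGE to print the 16 formulas (the intercept-only model plus the 2^4 - 1 = 15 predictor subsets) that will be fit:
## sketch: list the 16 model formulas DREDGE enumerates for x1-x4
ii<-list()
for(i in 1:length(x)) ii<-c(ii,combn(x,i,simplify=FALSE))
c(paste(y,"~ 1"),
    sapply(ii,function(s) paste(y,"~",paste(s,collapse="+"))))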
Now, let's run it. I will fit both a Brownian and a Pagel's λ model for the structure of the residual error, and then combine the two lists of fitted models:
## Brownian error
fits.bm<-DREDGE(y,x,data=X,se=SE,tree=tree,method="ML")
## the first three of these (there are 16)
fits.bm[1:3]
## [[1]]
## Generalized least squares fit by maximum likelihood
## Model: model
## Data: cbind(data, vf)
## Log-likelihood: -223.3477
##
## Coefficients:
## (Intercept)
## 0.1609445
##
## Correlation Structure: corBrownian
## Formula: ~1
## Parameter estimate(s):
## numeric(0)
## Variance function:
## Structure: fixed weights
## Formula: ~vf
## Degrees of freedom: 100 total; 99 residual
## Residual standard error: 0.9168844
##
## [[2]]
## Generalized least squares fit by maximum likelihood
## Model: model
## Data: cbind(data, vf)
## Log-likelihood: -220.0688
##
## Coefficients:
## (Intercept) x1
## 0.5595618 1.1912291
##
## Correlation Structure: corBrownian
## Formula: ~1
## Parameter estimate(s):
## numeric(0)
## Variance function:
## Structure: fixed weights
## Formula: ~vf
## Degrees of freedom: 100 total; 98 residual
## Residual standard error: 0.9339633
##
## [[3]]
## Generalized least squares fit by maximum likelihood
## Model: model
## Data: cbind(data, vf)
## Log-likelihood: -213.2678
##
## Coefficients:
## (Intercept) x2
## -0.07352865 -1.77071022
##
## Correlation Structure: corBrownian
## Formula: ~1
## Parameter estimate(s):
## numeric(0)
## Variance function:
## Structure: fixed weights
## Formula: ~vf
## Degrees of freedom: 100 total; 98 residual
## Residual standard error: 0.9507248
## lambda error
fits.lambda<-DREDGE(y,x,data=X,corClass=corPagel,tree=tree,method="ML")
fits<-c(fits.bm,fits.lambda)
## AIC values for each model
aic<-sapply(fits,AIC)
aic
## [1] 450.6955 446.1376 432.5356 450.2586 394.5708 430.4351 447.0244
## [8] 379.7736 428.2995 314.1018 396.3431 428.0877 291.3430 378.9618
## [15] 315.3299 293.1996 460.7735 456.1583 443.1340 460.3964 410.5457
## [22] 441.2863 457.0599 395.3290 439.3130 335.7291 412.4450 439.2609
## [29] 316.4621 395.0639 336.6753 318.3983
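As an aside, these AIC scores are easily converted into Akaike model weights, by exponentiating half the negative AIC differences and normalizing so that the weights sum to 1 across the candidate set. A minimal sketch (the "model" labels are just for illustration):
## Akaike weights from the vector of AIC scores
delta<-aic-min(aic)
w<-exp(-0.5*delta)/sum(exp(-0.5*delta))
round(setNames(w,paste0("model",1:length(w))),4)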
Finally, here is our best-fitting model based on AIC:
fits[aic==min(aic)][[1]]
## Generalized least squares fit by maximum likelihood
## Model: model
## Data: cbind(data, vf)
## Log-likelihood: -140.6715
##
## Coefficients:
## (Intercept) x1 x2 x4
## 0.6896117 0.9263428 -2.0679544 3.2012811
##
## Correlation Structure: corBrownian
## Formula: ~1
## Parameter estimate(s):
## numeric(0)
## Variance function:
## Structure: fixed weights
## Formula: ~vf
## Degrees of freedom: 100 total; 96 residual
## Residual standard error: 0.8682554
Not bad. In fact, the fitted model very closely approximates the simulation conditions: the best-supported model retains x1, x2, & x4, which are exactly the predictors given non-zero coefficients in the generating model below:
## simulate tree & data
set.seed(999)
tree<-pbtree(n=100,scale=1)
X<-fastBM(tree,nsim=4)
colnames(X)<-c("x1","x2","x3","x4")
beta<-c(1,1,-2,0,3)
y<-cbind(rep(1,Ntip(tree)),X)%*%beta+fastBM(tree)
v<-setNames(rexp(n=Ntip(tree)),tree$tip.label)
ye<-setNames(sampleFrom(xbar=y,xvar=v,n=rep(1,length(y))),
rownames(y))
X<-as.data.frame(cbind(ye,X))
SE<-sqrt(v)
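As a final sanity check, we can compare the coefficient estimates from our best-supported model to the generating values (assuming the objects fit above are still in memory):
## estimated coefficients of the best model by AIC
coef(fits[[which.min(aic)]])
## generating values for the intercept, x1, x2, & x4
setNames(beta[c(1,2,3,5)],c("(Intercept)","x1","x2","x4"))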