Sunday, April 9, 2017

New S3 print method for evolvcv.lite

I just pushed a small phytools update to GitHub which consists mostly of a new S3 print method for the phytools function evolvcv.lite. print methods, for the unitiated, are special functions that R uses to (usually) summarize the contents of complex objects created in memory in R.

The function evolvcv.lite (essentially a variante of phytools::evol.vcv) is based on an article published by David Collar & I back in 2009. The method is one which we fit different covariance structure for multivariate Brownian motion evolution in different a priori specified parts of the tree. What evolvcv.lite does in addition is fits a series of intermediate models - for instance, in which the evolution correlation is constant, but the variances changes, or vice versa. In the end it fits four different models in total - from one 'null' model of common covariance structure, through a no-common-structure model (essentially matching evol.vcv). The 'lite' aspect comes from the fact that it only permits two traits to be modeled - although I believe we can fit an arbitrary number of regimes.

Here is a quick demo of the new print method:

library(phytools)
packageVersion("phytools")
## [1] '0.6.3'
## read centrarchid tree
fish.tree<-read.tree("http://www.phytools.org/blog/Centrarchidae.tre")
fish.tree
## 
## Phylogenetic tree with 28 tips and 27 internal nodes.
## 
## Tip labels:
##  Acantharchus_pomotis, Lepomis_gibbosus, Lepomis_microlophus, Lepomis_punctatus, Lepomis_miniatus, Lepomis_auritus, ...
## 
## Rooted; includes branch lengths.
## read centrarchid data
fish.data<-read.csv("http://www.phytools.org/blog/Centrarchidae.csv",
    header=TRUE,row.names=1)
fish.data
##                         feeding.mode gape.width buccal.length
## Acantharchus_pomotis            pisc      0.114        -0.009
## Lepomis_gibbosus                 non     -0.133        -0.009
## Lepomis_microlophus              non     -0.151         0.012
## Lepomis_punctatus                non     -0.103        -0.019
## Lepomis_miniatus                 non     -0.134         0.001
## Lepomis_auritus                  non     -0.222        -0.039
## Lepomis_marginatus               non     -0.187        -0.075
## Lepomis_megalotis                non     -0.073        -0.049
## Lepomis_humilis                  non      0.024        -0.027
## Lepomis_macrochirus              non     -0.191         0.002
## Lepomis_gulosus                 pisc      0.131         0.122
## Lepomis_symmetricus              non      0.013        -0.025
## Lepomis_cyanellus               pisc     -0.002        -0.009
## Micropterus_coosae              pisc      0.045        -0.009
## Micropterus_notius              pisc      0.097        -0.009
## Micropterus_treculi             pisc      0.056         0.001
## Micropterus_salmoides           pisc      0.056        -0.059
## Micropterus_floridanus          pisc      0.096         0.051
## Micropterus_punctulatus         pisc      0.092         0.002
## Micropterus_dolomieu            pisc      0.035        -0.069
## Centrarchus_macropterus          non     -0.007        -0.055
## Enneacantus_obesus               non      0.016        -0.005
## Pomoxis_annularis               pisc     -0.004        -0.019
## Pomoxis_nigromaculatus          pisc      0.105         0.041
## Archolites_interruptus          pisc     -0.024         0.051
## Ambloplites_ariommus            pisc      0.135         0.123
## Ambloplites_rupestris           pisc      0.086         0.041
## Ambloplites_cavifrons           pisc      0.130         0.040
## this is a binary character that we're going to use for our regimes
fmode<-setNames(fish.data[,1],rownames(fish.data))
fmode
##    Acantharchus_pomotis        Lepomis_gibbosus     Lepomis_microlophus 
##                    pisc                     non                     non 
##       Lepomis_punctatus        Lepomis_miniatus         Lepomis_auritus 
##                     non                     non                     non 
##      Lepomis_marginatus       Lepomis_megalotis         Lepomis_humilis 
##                     non                     non                     non 
##     Lepomis_macrochirus         Lepomis_gulosus     Lepomis_symmetricus 
##                     non                    pisc                     non 
##       Lepomis_cyanellus      Micropterus_coosae      Micropterus_notius 
##                    pisc                    pisc                    pisc 
##     Micropterus_treculi   Micropterus_salmoides  Micropterus_floridanus 
##                    pisc                    pisc                    pisc 
## Micropterus_punctulatus    Micropterus_dolomieu Centrarchus_macropterus 
##                    pisc                    pisc                     non 
##      Enneacantus_obesus       Pomoxis_annularis  Pomoxis_nigromaculatus 
##                     non                    pisc                    pisc 
##  Archolites_interruptus    Ambloplites_ariommus   Ambloplites_rupestris 
##                    pisc                    pisc                    pisc 
##   Ambloplites_cavifrons 
##                    pisc 
## Levels: non pisc
## a single stochastic map
fish.tree<-make.simmap(fish.tree,fmode,model="ARD")
## make.simmap is sampling character histories conditioned on the transition matrix
## 
## Q =
##            non      pisc
## non  -6.087789  6.087789
## pisc  3.048905 -3.048905
## (estimated using likelihood);
## and (mean) root node prior probabilities
## pi =
##  non pisc 
##  0.5  0.5
## Done.

Note that, as always, we should normally repeat our optimization across a set of stochastic map character histories - not a single such history, as in this case! cols<-setNames(c(“blue”,“red”),c(“non”,“pisc”))

cols<-setNames(c("blue","red"),c("non","pisc"))
plot(fish.tree,colors=cols,ftype="i")
add.simmap.legend(colors=cols,prompt=FALSE,x=0,y=Ntip(fish.tree))

plot of chunk unnamed-chunk-2

## now we can pull out our character data
fish.X<-as.matrix(fish.data[,2:3])
fish.X
##                         gape.width buccal.length
## Acantharchus_pomotis         0.114        -0.009
## Lepomis_gibbosus            -0.133        -0.009
## Lepomis_microlophus         -0.151         0.012
## Lepomis_punctatus           -0.103        -0.019
## Lepomis_miniatus            -0.134         0.001
## Lepomis_auritus             -0.222        -0.039
## Lepomis_marginatus          -0.187        -0.075
## Lepomis_megalotis           -0.073        -0.049
## Lepomis_humilis              0.024        -0.027
## Lepomis_macrochirus         -0.191         0.002
## Lepomis_gulosus              0.131         0.122
## Lepomis_symmetricus          0.013        -0.025
## Lepomis_cyanellus           -0.002        -0.009
## Micropterus_coosae           0.045        -0.009
## Micropterus_notius           0.097        -0.009
## Micropterus_treculi          0.056         0.001
## Micropterus_salmoides        0.056        -0.059
## Micropterus_floridanus       0.096         0.051
## Micropterus_punctulatus      0.092         0.002
## Micropterus_dolomieu         0.035        -0.069
## Centrarchus_macropterus     -0.007        -0.055
## Enneacantus_obesus           0.016        -0.005
## Pomoxis_annularis           -0.004        -0.019
## Pomoxis_nigromaculatus       0.105         0.041
## Archolites_interruptus      -0.024         0.051
## Ambloplites_ariommus         0.135         0.123
## Ambloplites_rupestris        0.086         0.041
## Ambloplites_cavifrons        0.130         0.040

Finally, we are ready to fit our models:

fitMV.models<-evolvcv.lite(fish.tree,fish.X)
fitMV.models
## Model 1: common rates, common correlation 
##  R[1,1]  R[1,2]  R[2,2]  k   log(L)  AIC
## fitted   0.114   0.033   0.0556  5   72.1893 -134.3786   
## 
## (R thinks it has found the ML solution for model 1.)
## 
## Model 2: different rates, common correlation
##  R[1,1]  R[1,2]  R[2,2]  k   log(L)  AIC
## non  0.1989  0.027   0.0169  7   78.8252 -143.6505   
## pisc 0.0574  0.0307  0.0757  
## 
## (R thinks it has found the ML solution for model 2.)
## 
## Model 3: common rates, different correlation
##  R[1,1]  R[1,2]  R[2,2]  k   log(L)  AIC
## non  0.0938  -0.0011 0.0648  6   74.7265 -137.4529   
## pisc 0.0938  0.0555  0.0648  
## 
## (R thinks it has found the ML solution for model 3.)
## 
## Model 4: no common structure
##  R[1,1]  R[1,2]  R[2,2]  k   log(L)  AIC
## non  0.1459  -0.005  0.0119  8   83.0425 -150.085    
## pisc 0.0708  0.0637  0.0941  
## 
## (R thinks it has found the ML solution for model 4.)

We can easily compare this to evol.vcv:

fitMV<-evol.vcv(fish.tree,fish.X)
fitMV
## ML single-matrix model:
##  R[1,1]  R[1,2]  R[2,2]  k   log(L)
## fitted   0.114   0.033   0.0556  5   72.1893 
## 
## ML multi-matrix model:
##  R[1,1]  R[1,2]  R[2,2]  k   log(L)
## non  0.1458  -0.005  0.0118  8   83.0426 
## pisc 0.0707  0.0636  0.0939  
## 
## P-value (based on X^2): 1e-04 
## 
## R thinks it has found the ML solution.

For those interested, here is what the code of the S3 print method looks like:

print.evolvcv.lite<-function(x,...){
    if(hasArg(digits)) digits<-list(...)$digits
    else digits<-4
    ## MODEL 1  
    m1<-lapply(x$model1,function(a,b) if(is.numeric(a)) round(a,b) else a,b=digits)
    cat(paste("Model 1:",x$model1$description,"\n"))
    nn<-paste("R[",t(sapply(1:ncol(m1$R),paste,1:ncol(m1$R),sep=","))[upper.tri(m1$R,
        diag=TRUE)],"]",sep="")
    cat(paste("\t",paste(nn,collapse="\t"),"\tk\tlog(L)\tAIC","\n",sep=""))
    cat(paste("fitted",paste(m1$R[upper.tri(m1$R,diag=TRUE)],collapse="\t"),m1$k,
        m1$logLik,m1$AIC,"\n",sep="\t"))
    if(m1$convergence==0) cat("\n(R thinks it has found the ML solution for model 1.)\n\n")
    else cat("\n(Model 1 optimization may not have converged.)\n\n")
    ## MODELS 2-4
    for(i in 2:4){
        m<-lapply(x[[i]],function(a,b) if(is.numeric(a)) round(a,b) else a,b=digits)
        m$R<-lapply(m$R,function(a,b) if(is.numeric(a)) round(a,b) else a,b=digits)
        cat(paste("Model ",i,": ",m$description,"\n",sep=""))
        cat(paste("\t",paste(nn,collapse="\t"),"\tk\tlog(L)\tAIC","\n",sep=""))
        for(j in 1:length(m$R)){
            if(j==1) cat(paste(names(m$R)[j],paste(m$R[[j]][upper.tri(m$R[[j]],diag=TRUE)],
                collapse="\t"),m$k,m$logLik,m$AIC,"\n",sep="\t"))
            else cat(paste(names(m$R)[j],paste(m$R[[j]][upper.tri(m$R[[j]],diag=TRUE)],
                collapse="\t"),"\n",sep="\t"))
        }
        if(m$convergence==0) 
            cat(paste("\n(R thinks it has found the ML solution for model ",i,".)\n\n",sep=""))
        else cat(paste("\n(Model ",i,"optimization may not have converged.)\n\n",sep=""))
    }
}

No comments:

Post a Comment