Wednesday, September 20, 2017

An empirical case study using ratebytree: Body size & shape evolution in two lizard clades

As I have mentioned in a couple of previous posts, I am working on revision for a manuscript examining the statistical properties of the function ratebytree.

So far, to directly or indirectly address the reviewer & editor criticisms, I have:

1) Added the option of comparing parameters under other evolutionary models such as OU and EB (1, 2).

2) Added the option of comparing the rate of discrete character evolution under Mk models of various flavors (1, 2).

3) Added the calcuation of standard errors and a posthoc method for the object class (1, 2).

In addition, the editor requested that we include an empirical example. We are using the method in another in prep. study in which we (among other things) compare the rate of niche evolution among clades, but I thought I'd go back a few years and find another nice example.

After some searching, I settled on the relatively simple case of comparing the rate of body size & shape evolution between two lizard clades: the North American iguanian subfamily of Phrynosomatinae, and the diverse South American clade Liolaemidae.

Here are photos of each from Wikipedia to color up the post:

missing
A western fence lizard from Phrynosomatidae, Sceloporus occidentalis.
missing
A liolaemid, Liolaemus tenuis.

I am using the data of Harmon et al. (2010). The data is freely distributed through Data Dryad, so anyone who likes can download the files & follow along.

Both are relatively species rich lizard families or subfamilies - although our trees & data in each case only contain a relatively small number of taxa: 70 & 67 species, respectively.

I have also re-used the total tree ages of Harmon et al. - although the seemed a little unrealistic: 75 & 74 million years, respectively. No matter - our method is only sensitive to relative age, so in this case we are merely assuming that the two clades have approximately the same total age.

First, I will compare body size (SVL) evolution between the two groups.

Load packages:

library(phytools)
library(geiger)

These are the ages used in Harmon et al.:

ages<-setNames(c(75,74),c("phryn","liol"))
ages
## phryn  liol 
##    75    74
## a simple function to rescale
rescaleTree<-function(t,d){
    h<-max(nodeHeights(t))
    t$edge.length<-t$edge.length/h*d
    t
}

Now start with Phrynosomatinae:

phryn.tree<-read.tree("phrynosomatpl.phy")
phryn.tree<-rescaleTree(phryn.tree,ages["phryn"]/100)

Here I have chosen to rescale both trees to total depth in 100s of millions of years. This just makes our parameter values 100× larger & helps with optimization - but it does not affect the model fit of our different models.

phryn.data<-read.csv("iguanids.csv",row.names=1,header=TRUE,
    na.string=".")
chk<-name.check(phryn.tree,phryn.data)
chk
## $tree_not_data
## character(0)
## 
## $data_not_tree
##   [1] "amcris"   "anacut"   "anaen"    "anahl"    "analin"   "analli"  
##   [7] "anallog"  "analut"   "anang"    "anarg"    "anbah"    "anbarh"  
##  [13] "anbart"   "anbima"   "anbrem"   "anbrun"   "ancent"   "anchlo"  
##  [19] "anchri"   "ancoel"   "ancons"   "ancrist"  "ancuv"    "ancyb"   
##  [25] "andist"   "aneque"   "anever"   "anferr"   "angarm"   "angrah"  
##  [31] "angris"   "angund"   "anhomo"   "aninsol"  "anisol"   "anjub"   
##  [37] "ankrug"   "anleach"  "anline"   "anlon"    "anlong"   "anlontib"
##  [43] "anloy"    "anluce"   "anlucs"   "anluteo"  "anmarm"   "anmayn"  
##  [49] "anmest"   "anmone"   "annobl"   "anoccu"   "anolss"   "anopal"  
##  [55] "anophio"  "anpater"  "anpogu"   "anponc"   "anporc"   "anpulch" 
##  [61] "anpum"    "anrich"   "anroque"  "ansagr"   "ansemil"  "ansing"  
##  [67] "anstra"   "anstrat"  "anstrint" "anvalen"  "anvanid"  "anverm"  
##  [73] "anwatt"   "basbas"   "basgal"   "basplum"  "basvitt"  "brafas"  
##  [79] "chamad"   "cmbarbo"  "cmporc"   "consub"   "corcris"  "corhern" 
##  [85] "corperc"  "crotcol"  "ctenads"  "ctspect"  "ctssim"   "cycnub"  
##  [91] "dipalto"  "dipdar"   "dipsdor"  "ensleec"  "enylat"   "gamwis"  
##  [97] "hopspin"  "igig"     "laelong"  "laeserr"  "leabell"  "leacat"  
## [103] "leiocar"  "leiopers" "leioschr" "lioaba"   "lioalb"   "lioalt"  
## [109] "lioand"   "lioaud"   "liobel"   "liobib"   "liobit"   "liobuer" 
## [115] "liocanq"  "liochac"  "liochil"  "liocoer"  "liocuy"   "liodar"  
## [121] "liodono"  "liodorb"  "lioelong" "liofama"  "liofitz"  "liofus"  
## [127] "liograc"  "liograv"  "liohuac"  "lioirr"   "lioking"  "liokosl" 
## [133] "liokrie"  "liolaur"  "liolem"   "liolep"   "lioline"  "liolutz" 
## [139] "liomelan" "liomont"  "liomult"  "liomumc"  "lionig"   "lionitid"
## [145] "lionvr"   "lioocci"  "lioolon"  "lioorna"  "liopaul"  "liopetro"
## [151] "liopict"  "lioplat"  "liopoly"  "liopseud" "lioquil"  "liorioj" 
## [157] "liorober" "lioroth"  "lioruib"  "liosalin" "lioscap"  "lioschr" 
## [163] "liotenu"  "liowalk"  "liowieg"  "lioxant"  "liozap"   "morann"  
## [169] "opcuv"    "plicplic" "poacut"   "pomarm"   "prissca"  "pympal"  
## [175] "pympun"   "pymsom"   "pymzap"   "sauobes"  "stang"    "stapur"  
## [181] "stazur"   "stboet"   "stcad"    "stchot"   "stcrass"  "stcupr"  
## [187] "stdoell"  "stemp"    "steune"   "stfest"   "stfimb"   "stguen"  
## [193] "sthume"   "stimit"   "stirid"   "stlate"   "stlimi"   "stmelano"
## [199] "stocho"   "storien"  "storna"   "stperc"   "stpuyan"  "strhodo" 
## [205] "strosei"  "ststig"   "stvari"   "troeth"   "trospin"  "uphvaut" 
## [211] "uraflav"  "urnsup"

This tells us that we have some species in our data not in the tree, but not the converse. Let's just remove these from our dataset:

phryn.data<-phryn.data[phryn.tree$tip.label,]
phryn.size<-log(as.matrix(phryn.data)[,1]) ## size on a log scale.

Next, Liolaemidae:

liol.tree<-read.tree("liolaeminipl.phy")
liol.tree<-rescaleTree(liol.tree,ages["liol"]/100)
liol.data<-read.csv("iguanids.csv",row.names=1,header=TRUE,
    na.string=".")
chk<-name.check(liol.tree,liol.data)
chk
## $tree_not_data
## character(0)
## 
## $data_not_tree
##   [1] "amcris"   "anacut"   "anaen"    "anahl"    "analin"   "analli"  
##   [7] "anallog"  "analut"   "anang"    "anarg"    "anbah"    "anbarh"  
##  [13] "anbart"   "anbima"   "anbrem"   "anbrun"   "ancent"   "anchlo"  
##  [19] "anchri"   "ancoel"   "ancons"   "ancrist"  "ancuv"    "ancyb"   
##  [25] "andist"   "aneque"   "anever"   "anferr"   "angarm"   "angrah"  
##  [31] "angris"   "angund"   "anhomo"   "aninsol"  "anisol"   "anjub"   
##  [37] "ankrug"   "anleach"  "anline"   "anlon"    "anlong"   "anlontib"
##  [43] "anloy"    "anluce"   "anlucs"   "anluteo"  "anmarm"   "anmayn"  
##  [49] "anmest"   "anmone"   "annobl"   "anoccu"   "anolss"   "anopal"  
##  [55] "anophio"  "anpater"  "anpogu"   "anponc"   "anporc"   "anpulch" 
##  [61] "anpum"    "anrich"   "anroque"  "ansagr"   "ansemil"  "ansing"  
##  [67] "anstra"   "anstrat"  "anstrint" "anvalen"  "anvanid"  "anverm"  
##  [73] "anwatt"   "basbas"   "basgal"   "basplum"  "basvitt"  "brafas"  
##  [79] "calldrac" "chamad"   "cmbarbo"  "cmporc"   "consub"   "cophtex" 
##  [85] "corcris"  "corhern"  "corperc"  "crotcol"  "ctspect"  "ctssim"  
##  [91] "cycnub"   "dipalto"  "dipdar"   "dipsdor"  "ensleec"  "enylat"  
##  [97] "gamwis"   "hollac"   "holmac"   "holprop"  "hopspin"  "igig"    
## [103] "laelong"  "laeserr"  "leabell"  "leacat"   "leiocar"  "leiopers"
## [109] "leioschr" "morann"   "opcuv"    "pettha"   "phyasio"  "phycora" 
## [115] "phycorn"  "phydoug"  "phyhern"  "phymcl"   "phymod"   "phyorb"  
## [121] "phyplat"  "physola"  "phytau"   "plicplic" "poacut"   "pomarm"  
## [127] "prissca"  "satang"   "sauobes"  "scadl"    "scbic"    "sccar"   
## [133] "sccau"    "scchry"   "scclar"   "sccouc"   "sccyan"   "scform"  
## [139] "scgadov"  "scgrac"   "scgram"   "schorr"   "schuns"   "scjala"  
## [145] "scjarr"   "sclund"   "scmac"    "scmag"    "scmala"   "scmerr"  
## [151] "scmucr"   "scoccid"  "scocho"   "scoliv"   "scorcu"   "scorn"   
## [157] "scpar"    "scpict"   "scpoin"   "scpyro"   "scscal"   "scsini"  
## [163] "scsmar"   "scspin"   "scsquam"  "scteap"   "sctorq"   "scund"   
## [169] "scutif"   "scvari"   "scvirg"   "scwood"   "sczost"   "stang"   
## [175] "stapur"   "stazur"   "stboet"   "stcad"    "stchot"   "stcrass" 
## [181] "stcupr"   "stdoell"  "stemp"    "steune"   "stfest"   "stfimb"  
## [187] "stguen"   "sthume"   "stimit"   "stirid"   "stlate"   "stlimi"  
## [193] "stmelano" "stocho"   "storien"  "storna"   "stperc"   "stpuyan" 
## [199] "strhodo"  "strosei"  "ststig"   "stvari"   "troeth"   "trospin" 
## [205] "umaino"   "umascop"  "uphvaut"  "uraflav"  "urnsup"   "urograc" 
## [211] "uromic"   "uronigr"  "uroorn"   "utapalm"  "utastan"
liol.data<-liol.data[liol.tree$tip.label,]
liol.size<-log(as.matrix(liol.data)[,1])

Let's visualize our data:

par(mfrow=c(1,2))
phenogram(phryn.tree,phryn.size,spread.cost=c(1,0),ftype="i",fsize=0.6,
    ylab="log(SVL)",xlab="time my x 100",col="lightblue",
    ylim=range(c(phryn.size,liol.size)))
## Optimizing the positions of the tip labels...
title(main="a) Phrynosomatinae body size",cex.main=1.1)
labels<-setNames(seq(min(c(phryn.size,liol.size)),max(c(phryn.size,liol.size)),
    by=diff(range(c(phryn.size,liol.size)))/(Ntip(liol.tree)-1)),
    names(sort(liol.size)))
phenogram(liol.tree,liol.size,spread.cost=c(1,0),ftype="i",fsize=0.6,
    ylab="log(SVL)",xlab="time my x 100",col="lightgreen",
    ylim=range(c(phryn.size,liol.size)),
    label.pos=labels)
## Optimizing the positions of the tip labels...
title(main="b) Liolaemidae body size",cex.main=1.1)

plot of chunk unnamed-chunk-7

Now we can fit our model:

fit.size<-ratebytree(c(phryn.tree,liol.tree),list(phryn.size,
    liol.size))
fit.size
## ML common-rate model:
##          s^2  a[1]   a[2]    k   logL
## value    0.2613  4.1772  4.2621  3   -4.8496 
## SE       0.0316  0.1478  0.2394  
## 
## ML multi-rate model:
##          s^2[1] s^2[2]   a[1]   a[2]    k   logL
## value    0.1911  0.3346  4.1772  4.2621  4   -2.1877 
## SE       0.0323  0.0578  0.1264  0.2709  
## 
## Likelihood ratio: 5.3238 
## P-value (based on X^2): 0.021 
## 
## R thinks it has found the ML solution.
posthoc(fit.size)
## 
## Post-hoc test for "BM" model.
## (Comparison is of estimated values of sigma^2.)
## 
##                  t       df      P
## tree1 vs. 2 -2.167 101.0276 0.0326
## 
## P-values adjusted using method="none".

This tells us that clade #2, Liolaemidae, has about a 1.5-fold higher rate of body size evolution than does clade #1, the phrynosomatines, and that this difference is highly significant.

Note that this occurs even though for our taxa the range of variation is slightly greater in phrynosomatines than in Liolaemidae. How come? Most likely it is because greater evolutionary changes have occurred between close relatives in Liolaemidae than in phrynosomatines.

Next, on to body shape. For body shape I decided to use a slightly different approach than in Harmon et al. in that I decided to first compute a mean (phylogenetic) eigenstructure for the different traits of each group, and then compute scores in this space, rather than in separate spaces as in Harmon et al. This (I think) makes “shape” for the two different clades more directly comparable - although it is only possible here because the traits are precisely the same. (Harmon et al. also compared shape evolution in clades measured for different traits.)

One complication that we have to deal with is that some characters for some taxa are missing. I just decided to remove these from the datasets (and trees):

taxa.na<-names(which(apply(phryn.data,1,function(x) any(is.na(x)))))
taxa.na
## [1] "sclund" "scpar"  "sccouc"
phryn.tree<-drop.tip(phryn.tree,taxa.na)
phryn.data<-phryn.data[phryn.tree$tip.label,]
pca.phryn<-phyl.pca(phryn.tree,log(phryn.data))
pca.phryn
## Phylogenetic pca
## Standard deviations:
##        PC1        PC2        PC3        PC4 
## 0.90696599 0.34633365 0.09262882 0.07084188 
## Loads:
##              PC1        PC2          PC3         PC4
## SVL   -0.9572559 -0.2344029  0.167252040  0.02726240
## FIXTL -0.8816620  0.4717591 -0.002422735 -0.01046589
## TAL   -0.9396173 -0.3187826 -0.057246707 -0.11054350
## TLL   -0.9647890 -0.2130534 -0.109288095  0.10884222
taxa.na<-names(which(apply(liol.data,1,function(x) any(is.na(x)))))
taxa.na
## [1] "lionitid" "liofus"   "liozap"   "liopaul"  "liolep"   "liocoer"
liol.tree<-drop.tip(liol.tree,taxa.na)
liol.data<-liol.data[liol.tree$tip.label,]
pca.liol<-phyl.pca(liol.tree,log(liol.data))
pca.liol
## Phylogenetic pca
## Standard deviations:
##        PC1        PC2        PC3        PC4 
## 1.08598168 0.33608157 0.20326448 0.09855278 
## Loads:
##              PC1        PC2         PC3           PC4
## SVL   -0.9545595 -0.1419970  0.25571368 -0.0571277337
## FIXTL -0.9042343  0.4258316 -0.03205649 -0.0003164605
## TAL   -0.9364986 -0.2411494 -0.24315585 -0.0754485503
## TLL   -0.9666380 -0.1976977 -0.01222497  0.1624102850

Now, I'll compute the mean eigenstructure across the two datasets:

Evec<-(pca.liol$Evec+pca.phryn$Evec)/2

Finally, I can compute scores in this new space. Shape with be scores on PCA 2 as we have not removed size from this analysis:

phryn.shape<-((as.matrix(log(phryn.data))-matrix(1,nrow(phryn.data),1)%*%
    t(phyl.vcv(as.matrix(log(phryn.data)),vcv(phryn.tree),1)$alpha))%*%
    Evec)[,2]
liol.shape<-((as.matrix(log(liol.data))-matrix(1,nrow(liol.data),1)%*%
    t(phyl.vcv(as.matrix(log(liol.data)),vcv(liol.tree),1)$alpha))%*%
    Evec)[,2]

Now, we can compare the rates for our two clades!

fit.shape<-ratebytree(c(phryn.tree,liol.tree),list(phryn.shape,
    liol.shape))
fit.shape
## ML common-rate model:
##          s^2  a[1]   a[2]    k   logL
## value    0.1148  0   0   3   47.4324 
## SE       0.0144  0.0993  0.1588  
## 
## ML multi-rate model:
##          s^2[1] s^2[2]   a[1]   a[2]    k   logL
## value    0.1181  0.1112  0   0   4   47.4611 
## SE       0.0204  0.0201  0.1007  0.1563  
## 
## Likelihood ratio: 0.0574 
## P-value (based on X^2): 0.8106 
## 
## R thinks it has found the ML solution.
posthoc(fit.shape)
## 
## Post-hoc test for "BM" model.
## (Comparison is of estimated values of sigma^2.)
## 
##                  t       df      P
## tree1 vs. 2 0.2399 123.1659 0.8108
## 
## P-values adjusted using method="none".

Unlike size, shape shows no evidence for a difference in rate between clades.

5 comments:

  1. Dear dr. Revell,

    I have also SVL data of around 160 lacertid lizards. If you're interested?

    You can find the data as supplementary data for following article:

    Baeckens, S., Edwards, S., Huyghe, K. & Van Damme, R. (2015) Chemical signalling in lizards: an interspecific comparison of femoral pore numbers in Lacertidae. Biological Journal of the Linnean Society, 114, 44–57.

    I can provide you the Bayesian tree as wel.

    Let me know if you're interested:

    Simon Baeckens
    simon.baeckens@fas.harvard.edu

    ReplyDelete
  2. Liam. How do I intruduce measurement of error in the data to further analyse the data and the measurement of error to compare the rates of different traits ?

    ReplyDelete
  3. Dear Liam. I would like to know if I am comparing Two trees, asnd ratebytree output gives me a <0.05 P value, is it necessary to run the post hoc?
    Cheers

    ReplyDelete

Note: due to the very large amount of spam, all comments are now automatically submitted for moderation.