Wednesday, November 4, 2015

New function to simulate species trees from genus-level tree

I just added a function to phytools to automate the task of attaching species-level subtrees simulated under a Yule process to a backbone genus-level tree. I described this here last week.

Here is how it works:

## first update phytools
library(devtools)
install_github("liamrevell/phytools")
## load
library(phytools)
## here's our hypothetical genus tree:
genus.tree
## 
## Phylogenetic tree with 16 tips and 15 internal nodes.
## 
## Tip labels:
##  Vkfns, Ikfxnyht, Qkyca, Bnlpd, Wkpws, Bkhq, ...
## 
## Rooted; includes branch lengths.
plotTree(genus.tree,ftype="i")

plot of chunk unnamed-chunk-2

## here are our tips to add
tips
##  [1] "Ikfxnyht_opqcfh" "Ikfxnyht_nvaocw" "Ikfxnyht_knzcyf"
##  [4] "Qkyca_pzdukv"    "Qkyca_jukdfi"    "Qkyca_ritdjl"   
##  [7] "Qkyca_agrjou"    "Qkyca_qoyafl"    "Qkyca_qrfsbu"   
## [10] "Bnlpd_hidqnw"    "Bnlpd_wlthqi"    "Bnlpd_gayokd"   
## [13] "Bnlpd_lqjspu"    "Wkpws_xbahqo"    "Wkpws_pwdnuv"   
## [16] "Bkhq_hqdkbx"     "Bkhq_ojrbaw"     "Bkhq_pzwyfq"    
## [19] "Bkhq_dejpxs"     "Bkhq_efnyzp"     "Bkhq_kymfvj"    
## [22] "Bgln_tfuizj"     "Bgln_tpdsuf"     "Bgln_ycaqrs"    
## [25] "Bgln_havrzp"     "Bgln_bonyms"     "Bgln_nhquvm"    
## [28] "Nrztfp_blunwa"   "Iakywem_jqhxdr"  "Lhkrtyl_usqjhl" 
## [31] "Lhkrtyl_nbjhyc"  "Lhkrtyl_wmhkcd"  "Lhkrtyl_wmdkqp" 
## [34] "Lhkrtyl_dfhlkj"  "Lhkrtyl_kpadxb"  "Javcgbph_qvgmrh"
## [37] "Javcgbph_vzqfir" "Ctfioy_yjuxhp"   "Ctfioy_bcgonv"  
## [40] "Yrvp_prchgs"     "Yrvp_zcjuqx"     "Yrvp_usjbyc"    
## [43] "Hmvlk_nymqow"    "Hmvlk_ktjxnq"    "Hmvlk_crtdau"   
## [46] "Hmvlk_hemxwi"    "Hmvlk_xygpmk"    "Hmvlk_hxofgw"   
## [49] "Fioxrcyl_pecahk" "Fioxrcyl_uifsdv" "Okmfyg_ahxljq"
## add them:
species.tree<-genus.to.species.tree(genus.tree,tips)
plotTree(species.tree,ftype="i",fsize=0.7)

plot of chunk unnamed-chunk-2

That's it.

Obviously, the data & tree above were simulated. Here is code used in simulation

## first let's simulate our genus tree:
foo<-function() paste(sample(LETTERS,1),paste(sample(letters,
    round(runif(1,min=3,max=7))),collapse=""),sep="")
genera<-replicate(16,foo())
genus.tree<-pbtree(n=length(genera),tip.label=genera,scale=80)
genus.tree$edge.length[which(genus.tree$edge[,2]<=Ntip(genus.tree))]<-
    genus.tree$edge.length[which(genus.tree$edge[,2]<=Ntip(genus.tree))]+20
tips<-c()
for(i in 1:Ntip(genus.tree)){
    n.genus<-sample(0:6,1)
    if(n.genus>0) for(j in 1:n.genus) 
        tips<-c(tips,paste(genus.tree$tip.label[i],
        paste(sample(letters,6),collapse=""),sep="_"))
}

7 comments:

  1. It may be worth pointing out to readers Abouheif 1998 "Random trees and the comparative method: A cautionary tale" Evolution 52(4): 1197-1204.

    "One of the toughest problems facing comparative biology is the paucity of robust phylogenetic hypotheses for many taxonomic groups. Martins (1996) proposed a method to analyze comparative data in the absence of a known phylogeny using randomly generated trees. Before applying this method, however, researchers should be aware that (1) parameter estimates derived from this method essentially assume a star phylogeny, and thus, estimate the same evolutionary regression or correlation coefficient as traditional cross-species analyses; and (2) statistical conclusions derived from this method may be so conservative as to mask evolutionary patterns, such as Rensch's rule, and should be interpreted with caution."

    The PDF is at http://biology.mcgill.ca/faculty/abouheif/articles/Abouheif%201998.pdf .

    This paper is about making up entire trees, not just most of the tree, as with genus.to.species.tree(). However, to me, it suggests we should be cautious: having a mostly or entirely made up tree is not the same as having a tree based on data and may lead to problems in many analyses. Nothing wrong with the genus.to.species.tree() function, mind you, but researchers need to consider whether it is safe or not to use a tree based on this or similar approaches for their particular question. Sometimes (many times, I'd wager) it's far better to wait for data or work on the available subset of data.

    ReplyDelete
    Replies
    1. Yes, I would also not recommend using this function in the way you describe. - Liam

      Delete
  2. I blog frequently and I really appreciate your information. This great article has truly peaked my interest.

    http://prokr.com/furniture-moving-company-dammam/

    shoala.net/

    ReplyDelete
  3. Really impressive post. I read it whole and going to share it with my social circules. I enjoyed your article and planning to rewrite it on my own blog.
    pengobatan tumor
    cara mengatasi perut begah
    pengobatan liver bengkak
    obat keloid
    obat ginjal bengkak
    obat thalasemia
    obat benjolan di kepala
    obat benjolan di payudara

    ReplyDelete
  4. Le meilleur moment de luxe au monde est peu co├╗teux et Replique Montreextravagant! Vous souhaitez vendre cette montre de luxe - remise franck muller les montres Tout d'abord, nous devons confirmer cette

    ReplyDelete