Wednesday, February 24, 2016

Adding node labels (including bootstrap values) to a tree plotted using dotTree

Yesterday, a phytools blog reader asked the following:

Thank you very much for this great addition. May I ask if there's any way to add bootstrap value by each node on the tree plotted by dotTree?

The answer is “yes.” The internal plotting function used by dotTree (phylogram), though not exported from the phytools namespace, works much like the S3 plotting method plot.phylo in ape and other phytools plotting functions such as plotTree: it sets the object "last_plot.phylo" in the environment .PlotPhyloEnv, and thus can be used with functions such as nodelabels and tiplabels from the ape package.
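
To make the mechanism concrete, here is a minimal sketch that uses plot.phylo from ape rather than dotTree itself. The tree is a small random one from rtree and is there purely for illustration. After a tree is drawn, the plotting coordinates are kept in last_plot.phylo inside .PlotPhyloEnv, and nodelabels reads them from there:

```r
library(ape)

set.seed(1)
tr <- rtree(5)   ## small random rooted tree, for illustration only

pdf(NULL)        ## draw to a null device so that no file is written
plot(tr)         ## plot.phylo stores its coordinates as a side effect

## retrieve what the last tree plot left behind
pp <- get("last_plot.phylo", envir = .PlotPhyloEnv)
pp$Ntip          ## number of tips in the plotted tree
pp$Nnode         ## number of internal nodes
length(pp$xx)    ## one x coordinate per tip and internal node

## nodelabels() looks up the same coordinates to place its labels;
## with no arguments it simply prints the node numbers
nodelabels()
invisible(dev.off())
```

Any plotting function that fills in last_plot.phylo in the same way should work with nodelabels and tiplabels, which is why they can be layered on top of dotTree.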

Here's a demo which assumes that the bootstrap values are stored as node labels on the object of class "phylo":

library(phytools)
tree$node.label
##  [1] ""   "76" "88" "88" "80" "91" "99" "95" "85" "89" "81" "83" "98" "98"
## [15] "99" "77" "80" "87" "95" "78" "76" "92" "79" "93" "79"
dotTree(tree,X,standardize=TRUE,colors="grey")
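## skip the root, which has no bootstrap value; note that ":" binds more
## tightly than "+", so the node numbers are (2:tree$Nnode)+Ntip(tree)
nodelabels(tree$node.label[-1],node=2:tree$Nnode+Ntip(tree),
    adj=c(1,-0.2),frame="none")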

Alternatively, we can show the bootstrap proportion as a pie chart at each node:

dotTree(tree,X,standardize=TRUE,colors="grey")
nodelabels(node=1:tree$Nnode+Ntip(tree),
    pie=cbind(as.numeric(tree$node.label),100-as.numeric(tree$node.label)),
    piecol=c("black","white"),cex=0.5)

Note that we have to be very careful with things like bootstrap values and posterior probabilities in R because, though they are usually stored as node labels, they are actually associated with edges (i.e., splits), not with nodes! This distinction becomes important if we re-root the tree. Perhaps there should be more on this topic in a future post.
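
One way to see the distinction is to move the labels onto the edges they describe. The sketch below assumes the common convention that the label stored for a node describes the edge leading to that node from its parent. The tree is written out by hand as a plain list with the same fields as an object of class "phylo", and the support values are made up:

```r
## a toy four-tip tree, (((A,B),C),D), built by hand for illustration;
## node 5 is the root, nodes 6 and 7 are the other internal nodes
tree <- list(
    edge = matrix(c(5,6, 6,7, 7,1, 7,2, 6,3, 5,4), ncol = 2, byrow = TRUE),
    tip.label = c("A", "B", "C", "D"),
    Nnode = 3,
    node.label = c("", "90", "85")   ## hypothetical support values
)
n.tips <- length(tree$tip.label)

## the child (column 2) of each edge; internal children are numbered > n.tips
child <- tree$edge[, 2]
internal <- child > n.tips

## support for each edge: NA for terminal edges, otherwise the label of
## the node that the edge leads to
edge.support <- rep(NA_character_, nrow(tree$edge))
edge.support[internal] <- tree$node.label[child[internal] - n.tips]
edge.support
```

Stored this way, each value travels with its edge, which is the bookkeeping that gets disturbed when a tree is re-rooted and the parent and child of some edges swap roles.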

Data for this demo were simulated as follows:

tree<-pbtree(n=26,tip.label=LETTERS) ## random tree
X<-fastBM(tree,nsim=2) ## random traits
## random bootstraps
tree$node.label<-c("",round(runif(n=tree$Nnode-1,min=75,max=100)))

Tuesday, February 23, 2016

Extracting the terminal edge lengths for a set of tips

A phytools user asked the following:

Is there a good way to extract the branch length for a given set of tips? For a given tree, we would like to calculate the mean and variance for the terminal branch lengths leading to a specific set of tips. Is there a way in phytools to target a set of tips and extract those branch lengths?

If I understand the question properly, then what we have is a tree:

tree
## 
## Phylogenetic tree with 26 tips and 25 internal nodes.
## 
## Tip labels:
##  A, B, C, D, E, F, ...
## 
## Rooted; includes branch lengths.

and a set of tips:

tips
##  [1] "K" "V" "P" "S" "J" "Q" "X" "B" "H" "D"

and we want to extract the set of terminal edge lengths associated with these tips. This is easy.

## first get the node numbers of the tips
nodes<-sapply(tips,function(x,y) which(y==x),y=tree$tip.label)
## then get the edge lengths for those nodes
edge.lengths<-setNames(tree$edge.length[sapply(nodes,
    function(x,y) which(y==x),y=tree$edge[,2])],names(nodes))

We can check as follows:

## our edge lengths
edge.lengths
##          K          V          P          S          J          Q 
## 0.26489330 0.76661579 0.04553788 0.15279683 0.13592584 0.04553788 
##          X          B          H          D 
## 0.05511474 0.59774131 1.28135176 0.83764106
plotTree(tree)
edgelabels(round(tree$edge.length,3),cex=0.7)

That's all there is to it.
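
The original question also asked for the mean and variance of these branch lengths. As a small self-contained sketch, the same two-step lookup can be written with match and then summarized. The tree here is a hand-built toy stored as a plain list, not the simulated tree used above, and its branch lengths are invented:

```r
## toy four-tip tree, (((A,B),C),D), with made-up branch lengths
tree <- list(
    edge = matrix(c(5,6, 6,7, 7,1, 7,2, 6,3, 5,4), ncol = 2, byrow = TRUE),
    edge.length = c(0.5, 0.3, 0.2, 0.2, 0.5, 1.0),
    tip.label = c("A", "B", "C", "D"),
    Nnode = 3
)
tips <- c("D", "A", "C")

## tip numbers are simply positions in tip.label
nodes <- match(tips, tree$tip.label)

## a terminal edge is the row of tree$edge whose second column is the tip
edge.lengths <- setNames(tree$edge.length[match(nodes, tree$edge[, 2])], tips)
edge.lengths

## the summaries asked about in the question
mean(edge.lengths)
var(edge.lengths)
```

Because match returns the first matching position, this relies on each tip label being unique and on each tip appearing exactly once as a child in tree$edge, both of which hold for a valid tree.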

The data used for this exercise were simulated as follows:

library(phytools)
tree<-pbtree(n=26,tip.label=LETTERS)
tips<-sample(LETTERS,10)

Monday, February 22, 2016

PCM workshop in Puerto Rico / Taller en métodos comparados en San Juan, Puerto Rico

Intensive short course on macroevolution and phylogenetic comparative methods in R

We are pleased to announce a new graduate-level intensive short course on the use of R for phylogenetic comparative analysis and downstream implementation in macroevolutionary studies. The course will be four days in length and will take place at the Hyatt House Hotel of San Juan from the 28th of June to the 1st of July, 2016. This course is partially funded by the National Science Foundation, with additional support from the University of Massachusetts Boston and the University of Puerto Rico, Río Piedras. There are a number of full stipends available to cover the cost of travel, room and board for qualified students and post-docs. Applicants are welcome from any country; however, we expect that most admitted students will come from the Caribbean region and Latin America. Accepted students from further afield may be offered only partial funding for their travel expenses. Topics covered will include: an introduction to the R scientific computing environment, tree manipulation, independent contrasts and phylogenetic generalized least squares, ancestral state reconstruction, models of character evolution, diversification analysis, and visualization methods for phylogenies and comparative data. Course instructors will include Dr. Liam Revell (University of Massachusetts Boston), Dr. Luke Harmon (University of Idaho), Dr. Mike Alfaro (University of California, Los Angeles), and Dr. Ricardo Betancur (University of Puerto Rico).

Instruction in the course will be primarily in English; however, some of the instructors and TAs are competent or fluent in Spanish. Discussion, exercises, and activities will be conducted in both languages.

To apply for the course, please submit your CV along with a short (1 page) description of your research interests, background, and reasons for taking the course. Admission is competitive, and preference will be given to students with a background in phylogenetics and a compelling motivation for taking the course. In your application, please indicate your preferred travel airport, if appropriate. Applications should be submitted by email to pr.phylogenetics.course@gmail.com by April 1st, 2016. Applications may be written in English or Spanish; however, all students must have a basic working knowledge of scientific English. Questions can be directed to liam.revell@umb.edu.


Course on macroevolution and the use of phylogenetic comparative methods in R

We are pleased to announce a new intensive, workshop-style course for graduate students on the use of R in phylogenetic comparative methods, with a focus on macroevolutionary studies. The course will last four days and will be held at the Hyatt House Hotel in San Juan, Puerto Rico, from June 28 to July 1, 2016. The course will be partially funded by the National Science Foundation (United States), with additional support from the University of Massachusetts Boston and the Universidad de Puerto Rico, Río Piedras. Several full stipends are available to cover the cost of airfare and lodging for qualified students and postdoctoral researchers. Applications from any country will be accepted; however, we anticipate that most admitted students will be from the Caribbean region and other Latin American countries. Selected students from more distant countries may be offered only partial support toward their travel expenses. Topics to be covered in the course include: an introduction to the R computing environment, manipulation of phylogenetic trees, generalized least squares in a phylogenetic context, ancestral state reconstruction, models of evolution, phylogenetic diversification analysis, and visualization of phylogenies and comparative data. The course instructors will be: Dr. Liam Revell (University of Massachusetts Boston), Dr. Luke Harmon (University of Idaho), Dr. Mike Alfaro (University of California, Los Angeles), and Dr. Ricardo Betancur (Universidad de Puerto Rico).

The course will be taught primarily in English; however, some of the instructors and teaching assistants speak fluent Spanish. Course discussions, exercises, and activities will be conducted in Spanish and English.

Those interested in applying should send their CV and a short (1 page) description of their scientific interests, experience, and reasons for wanting to take the course. Admission will be competitive, and preference will be given to students with knowledge of phylogenetics who are carrying out research related to the course topics. The application should indicate the preferred airport for travel (if applicable). Applications may be written in English or Spanish and should be sent by email to pr.phylogenetics.course@gmail.com before April 1, 2016. All students are expected to have a basic level of scientific English. Additional questions can be directed to liam.revell@umb.edu.


Sunday, February 21, 2016

dotTree for discrete character data

I just added a feature to dotTree (described previously on this blog here: 1, 2) that permits discrete characters. In this case the dots will simply be colored by state.

For a single trait, a similar effect can be produced using tiplabels in the ape package; however, the advantage of dotTree is that it can easily represent the tip states for a large number of traits simultaneously.
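
As a rough sketch of the kind of input this implies, several discrete traits can be held in one character matrix with tip labels as row names, alongside a color vector named by state. The tip names, trait names, and values below are invented, and the last step only illustrates the state-to-color mapping; it is not a reproduction of what dotTree does internally:

```r
## three hypothetical binary traits scored for four hypothetical tips
X <- cbind(
    trait1 = c(t1 = "a", t2 = "b", t3 = "b", t4 = "a"),
    trait2 = c(t1 = "b", t2 = "b", t3 = "a", t4 = "a"),
    trait3 = c(t1 = "a", t2 = "a", t3 = "a", t4 = "b")
)

## every state that occurs anywhere in the matrix
states <- sort(unique(as.vector(X)))

## one color per state, named by state so that the mapping is explicit
colors <- setNames(c("blue", "red"), states)

## the color each dot would take, tip by tip and trait by trait
dot.colors <- matrix(colors[as.vector(X)], nrow(X), ncol(X),
    dimnames = dimnames(X))
dot.colors
```

Naming the color vector by state means the colors stay attached to the right states even if a state happens to be absent from one of the traits.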

Here's a quick demo with 6 binary traits:

library(phytools)
packageVersion("phytools")
## [1] '0.5.18'
tree
## 
## Phylogenetic tree with 50 tips and 49 internal nodes.
## 
## Tip labels:
##  t12, t16, t29, t30, t14, t15, ...
## 
## Rooted; includes branch lengths.
X
##     [,1] [,2] [,3] [,4] [,5] [,6]
## t12 "a"  "b"  "b"  "a"  "b"  "b" 
## t16 "a"  "b"  "b"  "a"  "a"  "b" 
## t29 "a"  "b"  "a"  "a"  "a"  "a" 
## t30 "b"  "a"  "a"  "b"  "a"  "a" 
## t14 "b"  "b"  "a"  "a"  "a"  "a" 
## t15 "a"  "a"  "a"  "b"  "a"  "a" 
## t2  "b"  "b"  "a"  "a"  "a"  "a" 
## t32 "b"  "b"  "a"  "a"  "b"  "a" 
## t33 "b"  "b"  "a"  "a"  "b"  "a" 
## t21 "b"  "a"  "a"  "b"  "a"  "a" 
## t47 "b"  "b"  "b"  "a"  "a"  "a" 
## t48 "b"  "b"  "b"  "a"  "a"  "a" 
## t34 "b"  "b"  "b"  "a"  "b"  "a" 
## t23 "b"  "a"  "b"  "b"  "a"  "a" 
## t25 "b"  "b"  "b"  "a"  "a"  "a" 
## t26 "b"  "b"  "a"  "a"  "b"  "a" 
## t9  "a"  "a"  "b"  "a"  "a"  "a" 
## t10 "a"  "a"  "b"  "b"  "a"  "a" 
## t3  "b"  "b"  "b"  "b"  "a"  "a" 
## t8  "b"  "a"  "a"  "b"  "b"  "a" 
## t43 "a"  "a"  "a"  "a"  "a"  "a" 
## t44 "a"  "a"  "a"  "a"  "a"  "a" 
## t39 "a"  "a"  "b"  "b"  "b"  "b" 
## t40 "a"  "a"  "a"  "b"  "b"  "b" 
## t1  "b"  "a"  "a"  "b"  "a"  "b" 
## t4  "b"  "b"  "b"  "b"  "a"  "a" 
## t5  "b"  "b"  "a"  "b"  "a"  "a" 
## t13 "b"  "b"  "a"  "b"  "b"  "b" 
## t41 "b"  "a"  "a"  "a"  "b"  "a" 
## t42 "b"  "a"  "b"  "a"  "b"  "a" 
## t17 "b"  "a"  "b"  "a"  "b"  "b" 
## t18 "b"  "b"  "b"  "a"  "b"  "a" 
## t27 "a"  "b"  "b"  "a"  "b"  "b" 
## t28 "b"  "b"  "a"  "a"  "b"  "b" 
## t24 "b"  "b"  "b"  "a"  "b"  "b" 
## t49 "b"  "a"  "a"  "a"  "b"  "a" 
## t50 "b"  "a"  "a"  "a"  "b"  "a" 
## t22 "b"  "b"  "a"  "b"  "a"  "a" 
## t45 "b"  "b"  "a"  "a"  "a"  "a" 
## t46 "b"  "b"  "a"  "a"  "a"  "a" 
## t38 "b"  "b"  "a"  "a"  "b"  "a" 
## t11 "b"  "b"  "b"  "a"  "b"  "a" 
## t19 "a"  "a"  "a"  "a"  "b"  "b" 
## t31 "a"  "b"  "a"  "a"  "a"  "b" 
## t35 "a"  "a"  "a"  "a"  "b"  "a" 
## t36 "b"  "b"  "a"  "a"  "b"  "b" 
## t37 "a"  "a"  "a"  "a"  "b"  "a" 
## t20 "a"  "a"  "a"  "b"  "a"  "b" 
## t7  "a"  "a"  "a"  "b"  "a"  "b" 
## t6  "a"  "b"  "a"  "a"  "b"  "b"
colors<-setNames(c("blue","red"),c("a","b"))
colors
##      a      b 
## "blue"  "red"
dotTree(tree,X,colors=colors,data.type="discrete",fsize=0.7)

As with data.type="continuous", there is also a method that uses plotTree internally and works for a single character. For instance:

X[,1]
## t12 t16 t29 t30 t14 t15  t2 t32 t33 t21 t47 t48 t34 t23 t25 t26  t9 t10 
## "a" "a" "a" "b" "b" "a" "b" "b" "b" "b" "b" "b" "b" "b" "b" "b" "a" "a" 
##  t3  t8 t43 t44 t39 t40  t1  t4  t5 t13 t41 t42 t17 t18 t27 t28 t24 t49 
## "b" "b" "a" "a" "a" "a" "b" "b" "b" "b" "b" "b" "b" "b" "a" "b" "b" "b" 
## t50 t22 t45 t46 t38 t11 t19 t31 t35 t36 t37 t20  t7  t6 
## "b" "b" "b" "b" "b" "b" "a" "a" "a" "b" "a" "a" "a" "a"
dotTree(tree,X[,1],fsize=0.7,ftype="i") ## default colors

The code of this update can be seen here.

The data above were simulated as follows:

tree<-pbtree(n=50)
Q<-matrix(c(-1,1,1,-1),2,2,dimnames=list(letters[1:2],letters[1:2]))
X<-replicate(6,sim.history(tree,Q)$states)

Finally, phytools can be installed from GitHub using devtools, e.g.:

library(devtools)
install_github("liamrevell/phytools")