A phytools user today contacted me about using the function to.matrix
with
ape::tiplabels
to visualize the state of a discrete character at the tips of
the tree.
I thought it could be useful to reiterate a demonstration of how this is done.
For this I will use the phylogeny & data on feeding mode for Centrarchidae, the sunfish.
library(phytools)
data(sunfish.tree)
data(sunfish.data)
sunfish.tree
is actually a "simmap"
object, but we just want a simple
"phylo"
tree here:
sunfish.tree
##
## Phylogenetic tree with 28 tips and 27 internal nodes.
##
## Tip labels:
## Acantharchus_pomotis, Lepomis_gibbosus, Lepomis_microlophus, Lepomis_punctatus, Lepomis_miniatus, Lepomis_auritus, ...
##
## The tree includes a mapped, 2-state discrete character with states:
## non, pisc
##
## Rooted; includes branch lengths.
sunfish.tree<-as.phylo(sunfish.tree)
sunfish.tree
##
## Phylogenetic tree with 28 tips and 27 internal nodes.
##
## Tip labels:
## Acantharchus_pomotis, Lepomis_gibbosus, Lepomis_microlophus, Lepomis_punctatus, Lepomis_miniatus, Lepomis_auritus, ...
##
## Rooted; includes branch lengths.
I'll also pull out the character feeding.mode
from my sunfish data:
fmode<-setNames(sunfish.data$feeding.mode,rownames(sunfish.data))
Now we're pretty much ready to go. The function to.matrix
in phytools
just converts a factor or character vector into a binary matrix. It works
as follows:
fmode
## Acantharchus_pomotis Lepomis_gibbosus Lepomis_microlophus
## pisc non non
## Lepomis_punctatus Lepomis_miniatus Lepomis_auritus
## non non non
## Lepomis_marginatus Lepomis_megalotis Lepomis_humilis
## non non non
## Lepomis_macrochirus Lepomis_gulosus Lepomis_symmetricus
## non pisc non
## Lepomis_cyanellus Micropterus_coosae Micropterus_notius
## pisc pisc pisc
## Micropterus_treculi Micropterus_salmoides Micropterus_floridanus
## pisc pisc pisc
## Micropterus_punctulatus Micropterus_dolomieu Centrarchus_macropterus
## pisc pisc non
## Enneacantus_obesus Pomoxis_annularis Pomoxis_nigromaculatus
## non pisc pisc
## Archolites_interruptus Ambloplites_ariommus Ambloplites_rupestris
## pisc pisc pisc
## Ambloplites_cavifrons
## pisc
## Levels: non pisc
to.matrix(fmode,levels(fmode))
## non pisc
## Acantharchus_pomotis 0 1
## Lepomis_gibbosus 1 0
## Lepomis_microlophus 1 0
## Lepomis_punctatus 1 0
## Lepomis_miniatus 1 0
## Lepomis_auritus 1 0
## Lepomis_marginatus 1 0
## Lepomis_megalotis 1 0
## Lepomis_humilis 1 0
## Lepomis_macrochirus 1 0
## Lepomis_gulosus 0 1
## Lepomis_symmetricus 1 0
## Lepomis_cyanellus 0 1
## Micropterus_coosae 0 1
## Micropterus_notius 0 1
## Micropterus_treculi 0 1
## Micropterus_salmoides 0 1
## Micropterus_floridanus 0 1
## Micropterus_punctulatus 0 1
## Micropterus_dolomieu 0 1
## Centrarchus_macropterus 1 0
## Enneacantus_obesus 1 0
## Pomoxis_annularis 0 1
## Pomoxis_nigromaculatus 0 1
## Archolites_interruptus 0 1
## Ambloplites_ariommus 0 1
## Ambloplites_rupestris 0 1
## Ambloplites_cavifrons 0 1
That's OK. Now let's use it with tiplabels
to plot piecharts for the tips
of the tree in which we color by the tip state of each species. Note that
I'm going to want to order the rows of my matrix to match the tip order of my
plotted tree:
plotTree(sunfish.tree,ftype="i",offset=0.6,fsize=0.9)
FMODE<-to.matrix(fmode,levels(fmode))
FMODE<-FMODE[sunfish.tree$tip.label,]
tiplabels(pie=FMODE,piecol=palette()[c(4,2)],cex=0.6)
legend("topleft",levels(fmode),pch=21,pt.bg=palette()[c(4,2)],
pt.cex=2.2)
Of course, we didn't need to do it that way. We could have also just plotted points at the tips of the tree as follows:
plotTree(sunfish.tree,ftype="i",offset=0.6,fsize=0.9)
obj<-get("last_plot.phylo",envir=.PlotPhyloEnv)
cols<-setNames(palette()[c(4,2)],levels(fmode))
points(obj$xx[1:Ntip(sunfish.tree)],
obj$yy[1:Ntip(sunfish.tree)],pch=21,cex=2.2,
bg=cols[fmode[sunfish.tree$tip.label]])
legend("topleft",levels(fmode),pch=21,pt.bg=palette()[c(4,2)],
pt.cex=2.2)
Neat!
How the heck did that work?
Well, first of all get("last_plot.phylo",envir=.PlotPhyloEnv)
gets a
special environmental variable that is created by plotTree
(and many other
phylogenetic plotting functions - it is based on ape::plot.phylo
) with
lots of different attributes of the most recently plotted tree - including
the coordinates of all the nodes in the tree in
get("last_plot.phylo",envir=.PlotPhyloEnv)$xx
and
get("last_plot.phylo",envir=.PlotPhyloEnv)$yy
.
Next, when we create a vector of colors in which the names are the different
states for our character (as we did using
setNames(palette()[c(4,2)],levels(fmode))
), then the simple call
cols[fmode[sunfish.tree$tip.label]]
gives us a vector with the colors
that correspond to the states at all the tips of the tree - in the order
of the tips.
Not bad.
Dear Liam,
ReplyDeleteNice blog. I have a terrible question for you. I am trying to use as.matrix as you suggested instead of using ape. I have a number of species with two discrete traits (like in the example with the fish). I use the setNames function as suggested but the as.matrix step after fails to work. This is what I get as an error message:
"Flocking" "Flocking"
Sheppardia_sharpei Spermophaga_ruficapilla
"Flocking" "Flocking"
Turdus_olivaceus Vermivora_chrysoptera
"Solitary" "Flocking"
Zosterops_conspicillatus
"Flocking"
> to.matrix(flock,levels(flock))
Error in `[<-`(`*tmp*`, x == seq[i], i, value = 1) :
subscript out of bounds
Any idea what is wrong here? Cheers,
-Guy Beauchamp
Dear Guy. It looks like your data are of mode character instead of factor. You could try x<-as.factor(x). When you use to.matrix with tiplabels, also remember that you should sort your matrix by the order of the tip labels in the tree. For instance to.matrix(x[tree$tip.label],levels(x)).
DeleteGood luck.
I'm having a similar 'subscript out of bounds' error as the user above, my setup is as follows:
Deletetree75<-read.tree("Concatenated_Tree_75_Tr1.tre")
chardat<-read.csv("D:/TrimmaASR/trimma_dat.csv", sep=",", row.names=1)
fmode<-as.factor(setNames(chardat$Char..1, rownames(chardat)))
FMODE<-to.matrix(fmode, levels(fmode))
plotTree(tree75, ftype="i", offset=0.6, fsize=0.35, lwd=1)
FMODE<-to.matrix(fmode, levels(fmode))
FMODE<-FMODE[tree75$tip.label,]
And the last line returns the 'subscript out of bounds' error. Removing the comma in the last function after tree75$tip.label does not return the error but continuing through this example with my data does not produce pie charts for the trait data at the tips. Looking at your comment above I can't figure out where to place those functions in my own code, I am working with discrete binary presence/absence data.
Best, Dan S.
Hi Dan S. Most-likely, your tree contains some labels that are not present in your data frame. If you prune the tree so that it only contains species represented in your data, this should work. If you're not sure why there might be a mismatch, I suggest checking our geiger::name.check. Best wishes, Liam
DeleteThanks Liam, that helped. Now another issue I'm having is that I'm want to perform ancestral state reconstruction on trees generated in ASTRAL but when I load the data in with read.astral from the treeio package the branch lengths become vectors of NAs, resulting in a basic cladogram and not being able to map states on to the tips.
DeleteHi Dan. Can you read in the ASTRAL tree (without the annotation) using one of the standard parsers, such as ape::read.tree or ape::read.nexus? -- Liam
Deletethat's brill, thanks so much for this. i would now like to change my species names to another variable (a factor which has levels 'extra large', 'large', 'medium', and 'small'). I'd ideally like these in place of the species names whilst keeping the colours of my binary factor on the tips. Could you advise?
ReplyDelete