Please feel free to post suggestions, questions, or comments about the blog material, my R functions page, or phylogenetic comparative methods generally.

So far as I can tell, the above comment is spam - but since it is the only one I've got (as of 10/3/2011) on a whole page dedicated to comments, I have chosen to leave it up.

Hi Liam, I've been applying evol.rate.mcmc to some data and am trying to get my head around the analysis. From what I understand it finds a single rate shift in rates of phenotypic evolution within a tree. Is there any way to test the 'significance' of any identified shifts (i.e. especially for more subtle identified shifts)? I hope that question makes sense, thanks for any advice. regards

Ha! Here's a real comment for you: I am having a hard time using the phyl.pca function in phytools. No matter that I am passing the mode "cov", I get an error. > pca<-phyl.pca(C, Y, method="BM", mode="corr") "Error in colSums(invC %*% X) : error in evaluating the argument 'x' in selecting a method for function 'colSums': Error in invC %*% X : requires numeric/complex matrix/vector arguments. Any ideas?

Is C the vcv matrix from the tree? phyl.pca (unlike the function phyl_pca published in the appendix to Revell 2009) takes the tree not the vcv matrix as input, so you would do:

> pca<-phyl.pca(tree,Y,mode="corr")

If this is not your problem, I would be happy to troubleshoot any issues. If it is, then sorry for the confusion!

Dear Liam, Your blog is very interesting. As you work with Simmap, I would have a very basic question: I want to do stochastic character mapping on 10000 Bayesian trees in Simmap and display the results on the consensus tree. What output files do I have to use - is it the mutational maps, which can be saved optionally? Do I have to summarize all individual maps to display them on the consensus? I had no problem to make ASR in Simmap, but after checking the manual I still don't understand how to get the output for stochastic character mapping and how to display the results. Do you have any hint for me? Thanks, David

I have nothing to do with the program SIMMAP which is maintained by Jonathan Bollback. The phytools function make.simmap can do basic stochastic mapping. Other functions (read.simmap, write.simmap) in the phytools package read & write trees in the SIMMAP format.

In the future I hope to add some of the functionality you are talking about to the phytools package, but I am not there yet.

I am a PhD student of Vladimir Minin's at UW, and we have noticed some potentially undesirable behavior with the optim.phylo.ls function.

What we observed is that there appears to be some stochasticity when using the "fixed=TRUE" option; that is, it would sometimes return different answers when run repeatedly with the same input tree and sequence alignment.

The culprit appears to be the multi2di function within optim.phylo.ls. The multi2di function purports to resolve polytomies by explicitly denoting them as having branches of length 0, as opposed to a lack of a branch whatsoever. However, it is self-admittedly a stochastic procedure, in terms of exactly where it adds branches -- it seems that it should not matter in principle since it is merely adding branches of length 0. However, there appear to be some undesired consequences when each of these resolved trees are used in optim.phylo.ls

As a demonstration, consider the following multifurcating tree: http://i16.photobucket.com/albums/b37/peterchi/T1.png

Now, suppose we want to use this as the fixed topology on which to estimate the least squares branch lengths for a given sequence alignment. Let's say that we have saved the matrix of pairwise distances (according to JC69) as dist2.stan. Then, we would use the following R code:

However, if we run this repeatedly, we will observe that we obtain different answers between iterations: > T2<-optim.phylo.ls(dist2.stan,stree=T1,fixed=TRUE) best Q score of 0.0040827015 found after 0 nearest neighber interchange(s). > T2<-optim.phylo.ls(dist2.stan,stree=T1,fixed=TRUE) best Q score of 3.27601e-05 found after 0 nearest neighber interchange(s). > T2<-optim.phylo.ls(dist2.stan,stree=T1,fixed=TRUE) best Q score of 0.0040827015 found after 0 nearest neighber interchange(s). > T2<-optim.phylo.ls(dist2.stan,stree=T1,fixed=TRUE) best Q score of 0.0033840255 found after 0 nearest neighber interchange(s). > T2<-optim.phylo.ls(dist2.stan,stree=T1,fixed=TRUE) best Q score of 3.27601e-05 found after 0 nearest neighber interchange(s).

As mentioned above, the cause of these discrepancies appears to be the multi2di function. Observe what happens when we repeatedly plot the multi2di(T1) result; that is:

We then get the following trees: http://i16.photobucket.com/albums/b37/peterchi/multitrees.png

And, although they clearly are all the same unrooted tree, we can examine the structure of how each tree is stored (i.e. its edges), as follows: http://i16.photobucket.com/albums/b37/peterchi/edges.png

It thus seems to be this randomness that is causing the different resulting Q-scores and optimized branch lengths from a fixed topology.

We have been able to come up with a solution for our own purposes, but we just thought we would bring this observation to your attention.

Yes, I can help you with this. I think that, by convention in "phylo" objects, nodes closer to the root node on a given path from the root have smaller node numbers then do nodes on the same path that are farther from the root. That is to say, node 13 is not necessarily closer to the root than node 19; but, if for a set of four taxa, the pairwise MRCAs (excluding duplicates) are: 16, 15, & 14, then we know that MRCA of all 4 is 14.

Just in case this is not guaranteed to be the case, I have written a function that uses ape::mrca & phytools::nodeHeights to compare the pairwise MRCAs of a list of species to the heights of the MRCAs above the root, and then picks the one with the lowest height. I will post it shortly.

a simple question, how in trees mapped with "make.simmap", you can get the heights in the tree in which there are transition points between mapped states?

I am using your anc.Bayes function in phytools with a nexus tree and continuous data. When I run the line "anc.Bayes(tree,data,ngen=10000,control=list())" I get an error message:

"Error in if (post.odds > runif(n = 1)) { : missing value where TRUE/FALSE needed"

Do you know what this might be? I've had no problem with my tree or data in geiger or ape.

My guess is that in your during the MCMC one of: 1) the log-value of the likelihood for the current state of the chain; 2) the log-value of the likelihood for the proposed state; or 3) either of the log-priors; are numerically indistinguishable from -Inf or Inf. One thing you could do to check this is rescale your data by multiplying by 10 or 1/10. In addition, if you send me your dataset I would happy to try and diagnose the issue - and it would also be helpful in improving phytools.

Hi Liam, Suggestion for you. How about adding % variance explained for the PC axes as part of the output of phyl.pca? Not hard for someone to do separately, but it would be useful output to add. I'm digging phytools.

Hi, could you help me out with this? I'll just post my entire script-run.

> rm(list = ls(all = TRUE)) > setwd("M:\\pc\\Desktop\\Syntese MANUS\\Nye MHC verdier") > library(ape) > tree<-read.nexus("tree.nex") > exon<-read.table("exon.txt") > attach(exon) > X<-as.matrix(exon) > class(tree) [1] "phylo" > str(tree) List of 4 $ edge : int [1:14, 1:2] 9 10 10 11 11 12 12 9 13 13 ... $ edge.length: num [1:14] 1 1 1 1 1 1 1 1 1 1 ... $ Nnode : int 7 $ tip.label : chr [1:8] "Ficedula_hypoleuca" "Hylocichla_mustelina" "Luscinia_svecica" "Oenanthe_oenanthe" ... - attr(*, "class")= chr "phylo" - attr(*, "origin")= chr "tree.nex" > class(exon) [1] "data.frame" > str(exon) 'data.frame': 8 obs. of 6 variables: $ V1: Factor w/ 8 levels "Ficedula_hypoleuca",..: 1 2 3 4 5 6 7 8 $ V2: int 112 175 154 149 114 160 125 162 $ V3: int 159 259 241 228 167 243 183 257 $ V4: num 0.163 0.208 0.18 0.175 0.154 0.205 0.176 0.154 $ V5: num 0.134 0.227 0.179 0.146 0.136 0.164 0.18 0.13 $ V6: num 0.202 0.257 0.219 0.22 0.184 0.194 0.209 0.188 > class(X) [1] "matrix" > str(X) chr [1:8, 1:6] "Ficedula_hypoleuca" "Hylocichla_mustelina" "Luscinia_svecica" ... - attr(*, "dimnames")=List of 2 ..$ : NULL ..$ : chr [1:6] "V1" "V2" "V3" "V4" ... > source("phyl.pca.R") > pca.results<-phyl.pca(tree,X,method="lambda",mode="corr") [1] "Y has no names. function will assume that the row order of Y matches tree$tip.label" Error in invC %*% X : requires numeric/complex matrix/vector arguments > pca.results Error: object 'pca.results' not found

If you could help me out, I would be forever grateful.

PS, I saw the similar issue in the other comments, but seeing as "tree" is simply that, a phylogenetic tree and not a distance matrix, I don't get why the script returns this error.

Dear Liam I am new to this field and have a very question. I want to extract the species names (tip.label) of a tree. I downloaded this tree from one on the articles. It seems that this tree is a combine tree. If you help me how can I extract the species names. The str of the tree is: Class "multiPhylo" List of 3 $ tree mammalST_MSW05_bestDates :List of 4 ..$ edge : int [1:7522, 1:2] 5021 5022 5023 5023 5024 5024 5022 5021 5025 5026 ... ..$ Nnode : int 2503 ..$ tip.label : chr [1:5020] "Tachyglossus_aculeatus" "Zaglossus_bruijni" "Zaglossus_bartoni" "Ornithorhynchus_anatinus" ... ..$ edge.length: num [1:7522] 102.6 49.9 13.7 0.2 13.5 ... ..- attr(*, "class")= chr "phylo" $ tree mammalST_MSW05_lowerDates:List of 4 ..$ edge : int [1:7522, 1:2] 5021 5022 5023 5023 5024 5024 5022 5021 5025 5026 ... ..$ Nnode : int 2503 ..$ tip.label : chr [1:5020] "Tachyglossus_aculeatus" "Zaglossus_bruijni" "Zaglossus_bartoni" "Ornithorhynchus_anatinus" ... ..$ edge.length: num [1:7522] 114.2 40.5 11.5 0.2 11.3 ... ..- attr(*, "class")= chr "phylo" $ tree mammalST_MSW05_upperDates:List of 4 ..$ edge : int [1:7522, 1:2] 5021 5022 5023 5023 5024 5024 5022 5021 5025 5026 ... ..$ Nnode : int 2503 ..$ tip.label : chr [1:5020] "Tachyglossus_aculeatus" "Zaglossus_bruijni" "Zaglossus_bartoni" "Ornithorhynchus_anatinus" ... ..$ edge.length: num [1:7522] 91 59.3 15.9 0.2 15.7 15.7 75.2 13.1 44.4 8.4 ... ..- attr(*, "class")= chr "phylo".

The tip names are in the vector $tip.label. You can pull these out from each tree in the 'multiPhylo' object using (say) tree[[1]]$tip.label (for the 1st tree, for example).

Dear Liam, I am trying your phyl.cca function but I keep getting the following error: Error in ccs > 0 : invalid comparison with complex values is this a format problem with the data matrices or something else is going on? thanks for your reply and for developing this awesome 'phytools' cheers Arley

Hi, Liam, Like the above individual I'm also trying to use the phyl.cca function, but have been encountering three main issues: 1. Although I have the latest versions of R and phytools, R cannot find the function. It's definitely spelled right... 2. When I source the R code instead, the lambda estimate statement doesn't seem to register (the exact error: unused argument(s) (fixed = F)). I'm guessing this is because the code posted 9/7/2011 doesn't include updates to use lambda. 3. If I take out the lambda part, I get a "subscript out of bounds error," which I know must have something to do with matrices. I've tested the format with is.matrix and the rownames are identified (furthermore, I've successfully used phyl.pca, which has similar format requirements). Any suggestions? I'd be happy to send my code and files if needed. Thanks for taking a look at this. I enjoyed the comparative methods symposium at Evolution in Ottawa! Mercedes Burns

I have actually fixed the problem I had with phyl.cca I just had to re-order the rownames in the data matrices to match the order of the taxa in the tree, and it worked like a charm! For the record, I'm using version 0.1-8. arley

The first thing I would do is install the latest version of phytools. You should first detach phytools or (better yet), start a new R session. You can then update the package using update.packages() or install.packages("phytools"). Try to run it again and if you are not able, please let me know.

I'm glad you got this to run, but the report that the order matters concerns me. Did the rows have names beforehand? If they are named (in theory) it should not matter - so I should investigate this!

Thanks for your response, Liam! Issues 1 and 2 are resolved, but I'm still getting the subscript out of bounds error. I'll try Arley's solution of reordering the rownames, but yes, let us know if there's a fix! Mercedes

Hmm, the reordering rows trick didn't work for me. I get a subscript out of bounds error when I run: data<-phyl.cca(tree,X,Y,fixed=F) X is a matrix with 5 columns, Y is a matrix of 2 columns. The rows are identical in number and name. Any thoughts? Thanks again. Mercedes

Hi Mercedes. Hard for me to diagnose, but if you send me the input files, I can try and troubleshoot. This will also help improve phytools if there is a bug. liam.revell@umb.edu - Liam

Using brownieREML(), I am testing whether the evolutionary rate of a continuous trait (y) is contingent on a discrete multistate trait x <-c("a","b","c").

The evolutionary history of the multistate trait was reconstructed with the function make.simmap(x, model=c("SYM")). However make.simmap assumes that transitions are symmetric (model="SYM"), such that transitions from "a" to "b" are as likely than from "b" to "a". However i don't have a biological justification to assume this a priori. So, I wonder if there is a statistical justification for this assumption or should i try other models (e.g. model=c("ARD"). Unfortunately, results change when i do the latter.

Many thanks for you time and help, and congratulations for this helpful Blog.

If your results seem sensitive to the mapping method, you might also consider running stochastic mapping in the stand-alone program SIMMAP. It can do things not presently implemented in phytools, including sampling across model parameter values. You can export your data & tree using the phytools function export.as.xml (see ?export.as.xml for details) which will create an input file for SIMMAP.

Thanks Liam. I modified the ace() function to incorporate a designed matrix, for which i have some biological support. This seems to work well with make.simmap(). Cheers, Gabriel

Dear Liam, I am wondering, is it possible to give different colours to the taxa after pruning by using the drop.tip. I just want to show the locations of the used species in the whole tree. All the taxa in tree retains but with different colours? (e.g. used species blue or red and pruned ones grey)

Dear Liam Your fancyTree command is really cool and just what I needed. I am very new to this field and please excuse me for my silly question. I want to use this command but I have no idea how I can select the species that I want to prune. I have a super tree with thousands of species and there is my data set with few hundreds species. Now, I am not able to figured it out how to tell which species to retain and which to prune.. Could you help me in this regard.

Hi. If just want to, say, prune your tree to contain only the species represented in your dataset you can just do (assuming your data are in matrix X, and the row names contain species names): tips<-setdiff(tree$tip.label,rownames(X)) ptree<-drop.tip(tree,tips) Or, if you want to visualize it using fancyTree: tips<-setdiff(tree$tip.label,rownames(X)) ptree<-fancyTree(tree,type="droptip",tip=tips) - Liam

Dear Liam, thanks for the reply. I tried the code but the last line of the code, ptree<-fancyTree(tree,type="droptip",tip=tips) is not working. it gives the following message Error in plot.window(...) : need finite 'xlim' values I don't know why is not working.

Is there any possibility that the vector, tips, is either empty or contains all the tips in the tree (or, say, all but one). You may also want to update to the latest version of phytools to make sure that you have the newest version of the function.

Hey Liam, Do you have, or do you know of, a function that will do a phylogenetic MANOVA in which the canonical axes are evaluated while taking covariance due to phylogeny into account? I'm looking for something analogous to phyl.pca. I am aware that GEIGER has a function (phy.manova) to do a standard MANOVA and compare a test-statistic to a null distribution based on simulation under an empirically determined single rate matrix (ala Garland et al. 1993), but this approach does not provide phylogenetically informed canonical axes. I would like to know what variables are evolving differently in two groups of taxa. Also, neither group is monophyletic.

Unfortunately, I don't have a good answer. nlme::gls does not allow multiple response variables. Also, so far as I know, this has not been done with phylogenetic data - which makes me suspicious that simply extending generalized regression/ANOVA to multiple continuous dependent variables might be more complicated than it sounds. You could try the SIG-phylo list....

I would like to run a pgls analysis where I account for variation in the estimation of my species means. In other words, my response variable is a mean value calculated for each species in my phylogenetic tree. However, the number of samples used to calculate these means differs among the species in my tree. Is there any way to include a weighting factor that accounts for this variation (something like the square root of the sample size) in a pgls context? Do you have any suggestions for how to code this in R?

Have you tried the phytools function pgls.Ives? For this function, you need sampling variances on your estimates of the means for each species - which in this case would be the within-species variance divided by the sample size for each species.

If you do not have within-species variances, just the mean & sample size, please let me know as I'm pretty sure we could fit the Ives et al. (2007) model, but assuming that within-species variance was fixed across species and then estimating it. Obviously, it's better if you do know within species variances.

Note that with pgls.Ives you have two options for input: you can give the function the species means, variances of the means in x & y, and sampling covariance between x & y; or you can just feed the function with the raw data for all individuals - as long as the ordering is the same for x & y.

Unfortunately, so far I've only implemented the bivariate phylogenetic regression in this function.

Many thanks for your reply. I can calculate the within species variance, but have the problem that for some species I only have one observation (n per species ranges from 1 to 32, and I have 14 species in my analysis). I can try and implement the solutions that you suggest in an earlier blog (http://blog.phytools.org/2012/06/blombergs-k-phylogenetic-signal-with.html), but what do you consider to be a suitably homogeneous intraspecific variance to make this option a possibility? In my case it ranges from less than 1 to nearly 40.

Hi Nick. The within-species variances for a trait vary 40-fold? A couple of quick questions leap to mind:

(1) This is the variance, not the sampling variance, correct? The relationship is sampling variance = variance divided by sample size. phylosig() takes the sampling variances as input, but when we only have one (or very few) observation(s) per species, I suggest using the mean within-species variance as the sampling variance for those species (or the mean/pooled variance divided by the species-specific sample size). When we have only 1 sample, of course, the within-species variance and the sampling variance of the mean are the same.

(2) Are the data on a linear or log scale? Usually log-transformation will homogenize the variance. If your within-species variances (not sampling variances, remember) differ 40-fold, then it is probably because the trait varies widely among species as well and there is a relationship between the variance and the mean. Log-transformation will usually get rid of this.

I look forward to hearing more details about this and I will post the solution to the blog if it is likely to be useful for or informative to others.

Dear Liam, Just a small question. To you it might be a very simple question but I am not able to sorted it out and also I may not ask the question correctly but still asking it. I am trying to show the distribution of contineous trait data over the phylogenetics tree in colours. something like, 'tiplabels(pch=14, col=mycol(Trait data), cex=0.8)' but this gives me colours only at the tips as it should but if I want to colour the entire branch, how can I do it. Something like the background of your webpage with a lot of coloured trees. Thans

Check out the phytools function contMap and let me know if this is what you have in mind. If you have any difficulty figuring out how to use it, please let me know and I would be happy to help. - Liam

Dear Liam, Thanks for the reply. Yes, I am looking for this kind of function. I tried this function but it is giving me an error, "Error in while (x > trans[i]) { : missing value where TRUE/FALSE needed" Though "tiplabels" function is working fine. I must add , before using this function, I am also using the "drop.tip", function as well. Thanks

Dear Liam, thank you very much for this help. Ok, first I will try myself to sort it out if I am not able to plot tree then will surely send my data and tree to you. Thank you very much once again. In both cases, I will let you know.

Have you written any functions, or are you planning to, to simulate not just trait means, but trait variances for species? E..g, add trait variation as a output value for each tip along with the mean in fastBM.

I have a quick question: how can you assign states to nodes in a phylogeny, based on reconstructions. So lets say I have the data from my reconstruction, I want to say:

Node 16 <- 0 Node 17 <- 1 Node 18 <- 1

Such that I can list the descendants from any given node, and see if they differ from the node which I started with, hope that makes sense!

Dear Liam, I have used make.simmap to obtain 1000 stochastic maps for a larger clade. It contains two sister-clades. My main interest is to compare transitions for these two sister -clades. describe.simmap gives average values for the entire clade, while, countSimmap provides number of transitions for 1000 trees. Is there a way to obatin these values individually for sub-clades within phytools? or Do i have to run the analyses separately for each sister-clade to obtain these values?

I contacted you by mail, but the blog may be a nice channel for ideas exchange as well, I believe*.

I'm writting some code to actually simulate geographic dynamics using this CTMC framework. I implemented my code using geiger, but now that I've delved into phytools documentation and also this blog, I'm quite tempted to re-do my stuff to use phytools capabilities. I like the way you can do it all (simulate, fit and estimate) in a very clean and fast way. Since I'm planning a serious simulation study, this seems rather fit.

Do you think it is possible to incorporate an asymmetric Q, provided the restrictions for detailed balance (upon exponentiation) hold?

Cheers and thanks for providing,

Luiz

*I'm sorry to have posted this outside the general comments section, the first time.

I am asking this question here in the hopes it might be useful to others. Is it valid to compute partial R^2 values when you have multiple predictors in a PGLS or pGLM situation?

I'm wondering, do you know of a way of quantifying the amount of variation in reconstructed trait values in ancestral vs. younger nodes (in R), i.e. answering the question: "When in evolutionary history did change occur in said trait?"?

I've reconstructed values and all, I just do not want to have to calculate edge values by hand for a 200spp phylogeny :)

I am running the phy.cca to command of phytools for my dataset ( which I have sent to you by e-mail. Name: Nachiket Shankar ; Subject: phl.cca) I constantly get this error - and am unable to figure out the problem!

Error message: Error in solve.default(kronRC, y - D %*% a) : Lapack routine dgesv: system is exactly singular: U[1,1] = 0 Could you please help me out with this?

I just ran some of the code you sent me. It looks like you are trying to do CCA with only one x & one y? This is not the purpose of of CCA, which is designed for more than one x & y (essentially a multivariate correlation, see description here). You may just want a regular (phylogenetic) correlation, although it is a little difficult for me to recommend that without more information about your data & problem.

Is there a way to 'slice' a mutational map into successive time frames (or eras) and tally character state changes within each slice as a unit? I am curious to know if there is a method that allows one to estimate the amount of character change over units (bins) of time.

I am trying to test the effect that constraining the root state has on the number of transitions by setting pi = c(0,1) or even pi = c(.001, .999) but it still results in the root being of state 1 not state 2. Is it possible to force the root to be a particular state?

Just a question about computation of single rate BM in Brownie.lite. Why did you use optimtimization of the likelihood instead of computing directly the MLE? There is cases where the results could differ?

Hi Liam, I've been using densitymap function to summarize mutational maps made in SIMMAP. I have been unable to obtain the posterior probability values for each branch of the density tree as shown in the plotted tree. Is it possible to extract these values when running densitymap?

this brings me to my second question. I have a continuous trait mapped for the same tree. For each branch of the tree this trait takes a value between 0 and 260 (only integer numbers). Is there a way to estimate the correlation between this trait and the probability values from densitymap? Note that this will be a correlation between values in the internal branches, not with tip values. Is this analogous to a correlation between ancestral states?

brownie.lite is throwing me an error and I cannot track down why. When I enter " H.l.bm <- brownie.lite(Nym.pruned$phy, Hosts, maxit=1000)" it runs for 10 sec and then: Error in 1:m : argument of length 0

Nym.pruned$phy is a an object of class "phylo" with a mapped discrete character and Hosts is a vector with names corresponding to the species names in Nym.pruned$phy? If that the case, then let me know and feel free to send me your data (or .RData file) so I can investigate.

Dear Liam, I just discovered your blog, it's a pleasure! I have a question, maybe you have an idea?

I try to generate all trees being at RF distance of, let's say, 2 from a given tree. Do you have any idea on how this could be implemented and if this seems something complicated or not?

Hi Liam, thanks for both the great blog and phytools :o) I was wondering if I might ask you for your help. I am trying to test for phylogenetic signal in my data using Pagel´s lambda (phylosig function). I have the latest version of both R and phytools. I import the phylogeny via read.nexus command. I even sorted the species data so that they are in the same order as phylogeny. However, I end up with this error message every single time:

"some species in x are missing from tree, dropping missing taxa from x" [1] "some species in tree are missing from x , dropping missing taxa from the tree" Error in vcv.phylo(tree) : the tree has no branch lengths In addition: Warning message: In drop.tip(tree, setdiff(tree$tip.label, names(x))) : drop all tips of the tree: returning NULL

the same data and phylogeny work just fine in other analyses - PIC, calculation of K through Picante multiPhylosignal. Do you have any idea what might be the reason? I assume it is some silly mistake I am making, but have hard time figuring it out. Thanks in advance for any reply, Katerina

Hi, I'm having the same issue with my phylogeny tree, and I was wondering if you figured out what was wrong and how to fix it. My tree does have branch length, but I keep getting that same error when running phylosig. Thanks. Onja

I'm using the evol.rate.mcmc, minSplit and posterior.evolrate functions to calculate and plot splits in trait evolution rate on a phylogeny. I have no problem with the evol.rate.mcmc and the minSplit functions, but when I use the posterior.evolrate function I the following error pops up:

pevol.flor<-posterior.evolrate(mytree, para.m.flor$bp, m.flor$mcmc, m.flor$tips, showTree=TRUE) Error in split$node : $ operator is invalid for atomic vectors I also tried pevol.flor<-posterior.evolrate(mytree, 0.0001211563, m.flor$mcmc, m.flor$tips, showTree=TRUE) Error in split$node : $ operator is invalid for atomic vectors

I check whether all object I use as arguments are what they should be (matrix, list, etc, that is all ok). Can you help me please?

I am very new to this field and when I am trying to get phylogenetic distances or with other phylogenetic commands I get a similar error as mentioned above by Marina. Input; - t1<-read.tree(text=aa) - sample<-read.table(file="densityplotspecies.txt") - pd("sample","t1",include.root=TRUE) (with aa as a phylomatic newick output)

I tryed a lot of different things but I keep ending up with this error: Error in tree$edge.length : $ operator is invalid for atomic vectors

I'm trying to find a function that returns the node number of the most inclusive clade containing some tips, but not others. For example, for the following tree (((1,2),((3,4),((5,6),((7,8),(9,10)))) I want the most inclusive clade containing tips 7 and 9, but not 3. The answer would be the clade containing 5,6,7, and 8,9,10. Thanks in advance for any help you can provide. Joey

I am using PGLS to examine scaling among morphological characters. I wonder if you have advice on the most appropriate way to test for significant differences between the observed PGLS slope and isometry?

Is it advisable to constrain the slope to isometry and then compare the observed and constrained slopes using a t-test?

Hi Liam, I am looking for a way to do a phylogenetically independent PCA that accounts for intraspecific variation within a species. I thought I could indicate members of the same species as a polytomy with branch lengths=0. As a test, I created a tree with one branch that has length=0. I then ran phyl.pca and it crashed R. Any suggestions? There must be a better way to do it. -Will

Yes, this will not work because the phylogenetic covariance matrix among species in this case will be singular.

One tactic is to use species means to get the rotation, and then rotate the individual data using the eigenvectors from the among-species PCA. This allows you to obtain PCA scores from the phylogenetic PCA for all individuals in your analysis, but it does not take into account intraspecific variation in performing the phylogenetic PCA. - Liam

I have been trying to use ancThresh for a while. Once the previous issues were solved, now I get this, > mcmc<-ancThresh(tree,X,ngen=1000,control= list(piecol=cols,tipcol="estimated"),label.offset=0.01) **** NOTE: no sequence provided, column order in x MCMC starting.... gen 1000 Error in burnin:nrow(mcmc) : argument of length 0 Could you please provide me with some help?

This is probably because you need to increase ngen above the default (or decrease the sample interval). The default setting for the number of generations of MCMC to run, the sample interval, and the default burn-in are mutually incompatible! Try increasing ngen=100000 and see if that works.

congratulations on your blog! I've been trying to use Rphylip and hit an obstacle:

While phylip is installed and properly executed via the terminal (Ubuntu Linux), its path cannot be found in R. I observed you use the function findPath() to obtain it, however it throws an error in R:

> findPath() Error: could not find function "findPath"

Am I doing something wrong? Should the function be there? Any help would be appreciated.

findPath is an internal function in Rphylip, not part of the namespace. To see its guts, you can type:

Rphylip:::findPath

at the R command prompt.

If Rphylip is not working in Linux it probably because it cannot find where you have put the PHYLIP executables. This is not too surprising as findPath only searches in places where Windows & Mac OS users would normally put these files. For any R session you can use the Rphylip function setPath to set the path - and then you will not have to set it again. For more information about how to use setPath type:

?setPath

at the R command prompt.

Let me know if this helps. Sorry for the slow reply!

I am trying to reconstruct the ancestral state of a discrete genic characteristic across many loci. For each locus I am randomly sampling 100 trees from the posterior, performing 10 stochastic maps for each sample, and then outputting using the describe.simmap function. In my gene trees I have strong prior knowledge that one particular species should be the outgroup. Is there a way to assign an outgroup to the outputted tree using describe.simmap?

While I have your attention, is there a clever way to summarize ancestral state reconstructions across multiple loci? I am having trouble conceptualizing a method to so because the character I am reconstructing is one of the gene and not the species on a whole, and the phylogenys that I am reconstructing on will most likely have slightly different topologies. One summary statistic I can think of is to plot out the distribution of transitions from one state to another. But, that is all I got.

Any help, or pointing in the right direction would be greatly appreciated!

Hi, I'm trying to run Pagel's correlation method. However I'm having some trouble. I followed all code posted here (http://blog.phytools.org/2014/12/r-function-for-pagels-1994-correlation.html) and everything goes nicely. But when I use my own data:

library(phytools) source("fitPagel.R") .check.pkg<-phytools:::.check.pkg m <- read.table("matriz",row.names=1) tree <- read.nexus("tree.nexus") x <- m[,1] y <- m[,2] names(x) <- rownames(m) names(y) <- rownames(m) fit.ape<-fitPagel(tree,x,y)

I get the following error:

Error in matexpo(Q * EL[i]) : NA/NaN/Inf in foreign function call (arg 1) 3 matexpo(Q * EL[i]) 2 ace(xy, tree, type = "discrete", model = iQ) at fitPagel.R#30 1 fitPagel(tree, x, y)

Luis. matexpo is in ape. You could try installing geiger & running method="fitDiscrete". If that doesn't work, I would be willing to try to run your data and see if I can figure out the problem. - Liam

I am trying to run the "aov.phylo" function in geiger and am getting the following error: Error in aov.phylo(matrix ~ grp, tree, nsim = 50, test = "Wilks") : 'formula' must be of the form 'dat~group', where 'group' is a named factor vector and 'dat' is a data matrix or named vector.

Hi Liam, thank you very much for sharing all these useful information with the scientific community and for the time and support you devote and provide.

I have a couple of questions about the way in which the package caper handle the "relevel" and "anova" function in R. These work fine in other packages such as ape but when i try a PGLS analyses in caper i get some issue. The "relevel" function runs fine but the output table comes out with the original values in terms of intercept of reference. When i ask for an anova table instead i get this:

Error in terms.formula(formula, data = data) : invalid model formula in ExtractVars

Any idea about what i may be doing wrong? Thank you for your time.

As a new user, I need to simulate trees using phytools and wondering what would be the best resource in order to figure out how should be manipulating different tree parameters?

Hi Camila. It is not, but if you have estimated ancestral states from a different function, it is possible to use contMap to plot them. There is an explanation of how to do this here: http://blog.phytools.org/2014/08/droptip-method-for-object-of-class.html. If this is what you want to do, let me know and I may write another post about it or built a function to make it easier. - Liam

I'm trying to apply regressions, ANOVA and ANCOVA using the pGLS. When I am using discrete predictor on O-U model (corMartins), the following problem appears for most of the associations I tried to fit:

There are several functions available for simulating rate shifts in trees (e.g., SimTree, TESS), but these implement tree-wide shifts, which are somewhat unrealistic, rather than shifts descending from a particular node. Is it possible to implement a more realistic rate shifted tree? Do you have any ideas on how it might be done?

This wouldn't be too hard to do; however I'm surprised that it is not out there. Perhaps you should post your query to the R-sig-phylo email list as if it has already been implemented in software, someone on that list will surely know.

I am have issues with some code that can potentially be a bug - some of the colors are printing in the wrong order. I am attempting to plot colors by states on a pie graph for node labels. I used the two codes below. For the second code, if I list the names in alphabetical order of trait, it plots correctly.

On a separate note, I was wondering if there's anyway to use densityMap for more than one state? I have 9 discrete states and a tree with 1600+ tips. The plot with pie charts plotted comes out a bit messy since there are so many nodes. When I plot the tree as a fan it appears a little better, but still a bit cluttered.

Thanks very much for the amazing blog! It has made my brief foray into phylogenetics a lot more bearable!

I have a question about testing significance of the slope for pgls.Ives. I know this has come up before and the suggestion was to constrain the slope to 0 and run a likelihood ratio test. This all seems fine so far. As far as I can tell, you can specify the upper and lower bounds of the slope to be zero in the optim function. However, this then fails due to non-finite difference values, I think in the likelihood function. Is there a way round this? You can set the slope to be close to zero, but not actually zero. All of this could be completely wrong however, so any help would be much appreciated!

Hi Liam, I am simulating correlated phylogenetic data by using fastBM and then applying the correlation structure by multiplying the data matrix by the cholesky decomposition of my correlation matrix (M) to generate a new data matrix (newdata). To test that the results make sense, I calculate the phylogenetically independent vcv matrix using the method that you describe in your 2009 paper on phylogenetic size correction and pca. Once I have the vcv matrix, I convert it to a correlation matrix (B) and multiply newdata by inverse of the cholesky decomposition of B. This produces a data set very close to the original data simulated under fastBM so long as the number of characters is less than the number of taxa. If it's more, I get an error along the lines of "the leading minor of order 101 is not positive definite." The reason that I've determined is that the number of positive eigenvalues of the phylogenetically independent vcv is equal to the number of taxa minus one. Have you ever encountered this problem before? Is there a good solution? Best regards, -Will Gelnaw

I'm trying to test if my trait of study has been canalized over time. I don't just want to know if the breadth of the trait has been reduced, but I also need to know if the current trait values falls within the ancestral values. So far, my best idea has been to separately reconstruct max and min values for the trait and use that to estimate ancestral breadth. It might be a little tricky though because its a multivariate trait. I've been thinking that I would have to separately reconstruct the max and min for each PC axis. If this method is flawed or there's already a way to test this, can you let me know? Thank you for all the help you've already given me through this blog. You are a hero of grad students!

Thanks for providing such great resources. I have a couple of questions that I cannot seem to find answers for (and hope I have not missed anything obvious).

Is it possible to outline branches in plotSimmap? I would like to use black, grey and white for simplicity, but only if I can see the white branches...

Also, I notice that the colour at the nodes (vertical branches on a square phylogeny) are the colour of just one of the tipwards branches (the bottom branch). Is it possible for the nodes to be coloured equally by the two branches?

I'd like to plot/analyse a phyl.pca object with existing packages like e.g. with ggbiplot and thus am in the situation where I'd like to convert it into a prcomp or princomp object with sdev x rotation scale center attributes filled. I'm calling phyl.pca as follows: pc.pca<-phyl.pca(tree,df,method="BM",mode="corr")

I compute/fill the necessary attributes as: sdev=sqrt(diag(pc.pca$Eval)) x=pc.pca$S rotation=pc.pca$L scale=FALSE center=FALSE

But somehow this doesn't seem to be correct. The sd seems to be wrong. Also I'd like to know how to obtain scale and center values...

Lastly, I'd like to project supplementary variables onto the phyl.pc's coordinate space by cor(newvar.df,pc.pca$S), correct?

Sorry for asking so many, hopefully not too stupid questions.

I am looking for a function to calculate the support of nodes in consensus trees (e.g. Bremer support) but I found none. Do you know if any package includes that? Ape, which has the function consensus() does not.

Do you plan to include any of these things in the future in phytools?

Hi Borja. You can get bootstrap values for a set of trees from which we have computed a consensus tree using the function prop.part from the ape package. With regard to Bremer support - there may be some way to compute these values using the phangorn package. I would email Klaus, the author of that package, or R-sig-phylo. Borja, it was great to meet you in TucumÃ¡n. Hopefully our paths cross again in the future! All the best, Liam

I used the function phyl.pca with method="lambda" to perform a pPCA. The optimal lambda is less than 1 (0.747, precisely). How can I test if this is significantly different than 1 for the same dataset and tree?

Hey Liam, thank you so much for solving other people's problems. I got some of your tips by Google and now I can calculate K values & its p-value after three weeks dealing with data format (rooted & dichotomous) and so on. Keep doing good things. Cheers.

Hi, I am trying to generate a list of apomorphic discrete character changes for all nodes in a tree in R. This is the kind of thing that nearly every piece of phylogenetics software outside of R can do, but I need to extract data from such a list for a large number of trees in a bayesian sample. Any suggestions? JS

It may sounds like a very simple question but I need to simulate birth-death trees by means of the following code. It does not always return wanted number of trees so I was wondering how can I change the code to make it conditioned on the number of returned trees with n extant species.

While counting the number of clades in a set of "rooted" bootstrap trees using prop.clades(), I see, apparently erroneous, results. I have posted the question on stackoverflow (the link is http://stackoverflow.com/questions/33811950/how-to-properly-use-prop-part-and-prop-clades-functions-in-ape-for-bootstrap).

Can you please quickly look at the post to see if I am calling the wrong method or its just that my interpretation is wrong. Thanks in advance, Ikram Karolinska Institute, Stockholm

Im working right now on a phylogeny using the picante function "match.phylo.data", and i am getting again and again this error

> mdata<-match.phylo.data(CeroxTree, Biox) [1] "Dropping taxa from the data because they are not present in the phylogeny:" [1] "1" "2" "3" "4" "5" "6" "7" "8" "9" "10" "11" "12" "13" "14" "15" "16" "17" "18" "19" [20] "20" "21" "22" "23" "24" "25" "26" "27" "28" [1] "Dropping tips from the tree because they are not present in the data:" [1] "Aphandra_natalia" "Ammandra_decasperma" [3] "Phytelephas_tumacana" "Phytelephas_aequatorialis" [5] "Phytelephas_macrocarpa" "Phytelephas_tenuicaulis" [7] "Ravenea_sambiranensis" "Ravenea_krociana" [9] "Ravenea_hildebrandtii" "Ravenea_moorei" [11] "Ravenea_robustior" "Ravenea_rivularis" [13] "Ravenea_albicans" "Ravenea_xerophila" [15] "Ravenea_louvelii" "Oraniopsis_appendiculata" [17] "Juania_australis" "Ceroxylon_ventricosum" [19] "Ceroxylon_amazonicum" "Ceroxylon_echinulatum" [21] "Ceroxylon_parvifrons" "Ceroxylon_quindiuense" [23] "Ceroxylon_parvum" "Ceroxylon_vogelianum" [25] "Pseudophoenix_vinifera" "Pseudophoenix_sargentii_var_navassana" [27] "Pseudophoenix_lediniana" "Pseudophoenix_ekmanii" Warning message: In drop.tip(phy, setdiff(phytaxa, datataxa)) : drop all tips of the tree: returning NULL

The point is that the taxa names of the phylogeny and the dataset dont match, but I cant understand why. Even if in the error the taxa names of the dataset appear to be number, in the real data set they are not. I have checked the taxa names of the phylogeny and the data set and they do match. So I dont understand why instead of the real taxa names that I have in my dataset, the function uses the numbers of the rows of the dataset (these are not a column). Am I clear?

I have encountered an issue in the behavior of tiplabels when added to a densityMap plotted facing leftwards. When I try to add pie-style tip labels, I get the following error: > tiplabels(pie=to.matrix(Short[HeightDMap$tree$tip.label],sort(unique(Short))),col=cols2) Error in symbols(xpos, ypos, circles = radius, inches = FALSE, add = TRUE, : invalid symbol parameter

The command I am using to plot pie tip states works perfectly when the density map is plotted to the right. Using thermo instead of pie with an identical argument works fine facing left as well. I tried it with the original ML tree too, and pie worked fine facing left as well. I haven't been able to find any mention of this specific scenario here or elsewhere. Any suggestions?

Margaret

p.s. I'm still learning to use phytools, but so far it's impressively powerful!

Hi Liam I am a PhD student, my project about behavioral evolution study, I need to ask, do you have any training course about using R software for reconstructing phylogenetic tree?

in which chars is a matrix with taxon names (same as tr's tip labels) as row.names and tip states (single-character strings) in a column (the only column).

But then I get this error message:

Error in rowSums(xx) : 'x' must be an array of at least two dimensions

I suspect that the problem is in the format of x. I've also tried inputting x as a named vector, to no success.

I realize this is probably a stupid question, but that's because I'm now to phytools, and to R, and to comparative methods in general.

I have been using your function 'treeSlice' to section a tree at different time intervals (5mya, 10mya, etc). However, the subtrees are collapsed onto the crown, so that a subtree cut at 5mya will have an age of 2my, because the first 3my is a single branch. Do you know of a way to either prevent this from happening (so the subtrees are not collapsed onto the crown) or, after the fact, append a branch at the root (to "restore" the stem)?

I am use cophylo to compare one hierarchical clustering with one phylogenetic tree. I wonder if there is any way to color the tip (in my case species name)?

I have a question about the heatmap associated with my phylogeny. When part of my tree is collapsed, can the heatmap values in the collapsed part be accumulated?

It works well, but there are some mismatches between the real characters and the tips states on tree.

from the picture, some mismatches could be find: For our BRP as "1" character, but it labeled as "?" in tree, For MH_I, MH_C were as "1" character, but label as "0"; For Svulgaris, Slatifolia were as "0", but labeled as "1"; For Sturkerstanica as "?". but it was labeled as "1"

I have 22 characters, I try the other characters, they also show this mismatches. Do you know what's the problem for those? any suggestion on this?

After reconstruction of ancestral characters by rerootingMethod and make.simmap, I used nodelabels to plot pie charts of reconstructed characters onto internal nodes. However, I found the default size of the piecharts are too big for my tree. May I ask if there's any way to adjust the size of piechart in phytools?

I have produced LTT plots for a number of clades, but the phylogenies they are based on are not scaled to real time (instead they are ultrametric trees scaled to substitutions/site).

This makes the relative time x axis on the LTT plots not very useful to me. If you knew of any way to manually callibrate real time scales directly to the LTT plots, I would be very interested. I have root divergence dates from the literature that could be used in this, but I'm not sure how to add these to the plots.

I did reconstructions of ancestral discretely-valued states under ARD and SYM models (ARD was suggested the best model according to comparison by using geiger, and SYM the second). Likelihood values for the ancestral states at a few internal nodes were negative, and thus failed the plotting of nodal piecharts. Reconstruction under ER generated no negative likelihood values.

No, but my guess is that they are zero and are showing up as very slightly less than zero due to numerical precision (limitations in how precisely a computer can store a number). You could try rounding the object to (say) 8 digits, i.e.:

How can I test phylogenetic signal from geometric morphometrics data? I have the tree of 14 spp. of parasites and morphological traits..there are examples of these?? Tkanks, Abril

I have a quick and very basic question: is there any function in phytools to fit models of continuous traits evolution (something analogous to fitContinous in geiger) that admits missing data in the traits matrix?

Thanks in advance and congrats for this blog! Susana

Liam, This blog is awesome! I'm having an issue with ladderize {ape} and Ancestors {phangorn} that I'm hoping you can help me out with.

library(ape) library(phytools) library(phangorn)

#make tree with 5 tips and 4 internal nodes tree<-read.tree(text="(t1:228,((t2:57.065588,(t3:8.68127,t4:8.68127)100:48.384317)99:19.099685,t5:76.165273)100:151.834727);") tree is.ultrametric(tree) #its ultrametric plot(tree) #works plot(ladderize(tree)) #works

#add a tip to an internal branch above the node shared by t2 and t3 x<-match("t2",tree$tip.label) y<-match("t3",tree$tip.label) a<-Ancestors(tree,x) b<-Ancestors(tree,y) z<-a%in%b node<-a[min(which(z))] position<-0.5 * tree$edge.length[which(tree$edge[,2]==node)] new.tree<-bind.tip(tree,"t6",where=node, position=position)

plot(new.tree) #works plot(ladderize(new.tree)) #error about non-splitting nodes plot(ladderize(reorder(new.tree))) #still getting error about non-splitting plot(ladderize(collapse.singles(new.tree))) #still getting error about non-splitting #plot(collapse.singles(ladderize(new.tree))) #crashes R

tree$edge ladderize(tree)$edge #name number of rows

I've seen comments about phangorn::midpoint messing up ladderize, but it looked like the reorder function would fix it, which doesn't seem to be the case here. Any info would be great Liam, Thanks! -Travis

I am trying to do phylogeography using make.simmap to estimate ancestral states with nsim=100. I am concerned about that I have biased sampling for some of the groups and how can this affect the results. However, I see the object $counts created after running make.simmap with nsim contains a first column (N) with a different number per simulation. I would like to know what is this function really doing during the simulations, is it sampling randomly tips from the tree each iteration independently of their prior group or is it sampling taking into account the number of tips in each group (so, balancing number of tips from each group)? Many thanks for your help, Best,

I am trying to do a simple pPCA with a molecular phylogeny I made for 13 species. I have morphometric data for 6 morphologies on each specimen (n=2310). I have 2 questions: Q1) in my X matrix for the phyl.pca() function am I just using the average measurement for each morphometric? Q2) I keep getting the error: Error in invC%*% X : requires numeric/complex matrix/vector arguments.

When I convert X from a data frame to a matrix it changes to characters from numeric and I'm wondering if that has something to do with this problem?

Thanks for a super helpful blog! One question- I've noticed a few papers, when performing an ancestral state reconstruction, saying they only set nodes "with significant support (i.e., a single state that was at least 7.39 times more likely than the next most likely state; Pagel 1999)."

How is this 7.39 value calculated and where does 7.39 come from?

Thanks so much for this blog, it's helped me immensely.

I need to add a sister taxa to a nexus tree I have in R and have pruned.

I followed your tutorial on adding sister taxa (http://blog.phytools.org/2015/08/adding-tip-to-sister-taxon-on-tree.html) but before I can do it, I need to convert the tree to a phylo object. When I try to do that, R gives me this message: Error in UseMethod("as.phylo") : no applicable method for 'as.phylo' applied to an object of class "character" Any idea why?

Thanks very much for a fantastic blog! It has really helped me with my research. I have a quick question regarding the (potential) double use of phylogenetic correction. We have phenotypic traits (morphological), and functional traits (bite force). We have used a phylogenetic PCA to obtain phylogenetically corrected PC scores of the morphological traits. We would like to regress the first PC axis scores against the bite force. Would it be statistically correct to use the phyl.PC scores and the bite force values in a PGLS? We are concerned that phylogeny would be corrected for twice by doing the stats in this way. Should we use traditional PC scores and bite force values, and then do a PGLS?

I have been doing some simple tip additions using the add.species.to.genus function looped across a multiphylo using a list of taxa with where="random". However, every 15 to 20 trees, an error is thrown that a tree is not ultrametric. Inspection of the problem trees shows that occasionally, a tip will be added with a branch length longer than the remaining length of the tree. This of course make the tree no longer ultrametric and causes the error to be thrown when my code tries to add the next tip. I am wondering if you have any thoughts on why this might be occurring, or how I might try to solve it. Is there a minimum branch length that must be added using this function?

Thanks for the great resources that you provide. However, I am running into a problem with the midpoint.root() function for my tree. I get the error message > Error in phy$edge[, 2] : incorrect number of dimensions

I was able to narrow done the source of the error to the splitTree function that is called by the reroot function which is called by the midpoint.root function but I didn't get any further.

My Tree is generated by FastTree and made ultra metric with PATHd8 (although the same error occurs before that step).

Hey again. I dogged a bit deeper and the problem seems to be that splitTree() calls extract.clade() and in extract.clade, anc is found to be node 1117 and the smallest of 7 next.anc is node 1. Therefore *keep* evaluates to *start* + 0:0 and hence just *start* of length == 1. The the edge is updated to phy$edge <- phy$edge[keep, ] which evaluates to an one-dimensional numeric vector of length == 2 wherefore TIPS <- phy$edge[, 2] throws the error above.

Yet, although I understand what happens. I don't really understand why it happens and what I could do to fix it. Any help would be greatly appreciated!

Hi Fabian, I'm getting the same error message when using splitTree. I'm wondering if you found a solution (or a work-around) to this problem? Best, -Evan

Great package. I am trying to add a factor (binary with NAs) to a contMap before the species names, like a coloured square or something similar. Is there any easy way to do this?

Thank you for a fantastic toolset. In some tree-building programs (like RAxML), samples with identical seq. data are removed from analysis. I'm wondering then if it's possible to label tips with multiple sample names. I'm interested in using your co-phylogeny tools to examine consistency between different trees based on plasmids from the same bacterial sample, however, I want to make sure the same sample names are included in each of the trees, if that makes sense.

I am really battling with phytools as i am very new to the program. I need to get the mean values at each of the nodes in my tree. I have run ace and succeeded but i am battling to get the means for the nodes. Please can you help.

Dear Liam I also do have (maybe) a silly question. Is it possible and useful to rotate the loadings of the pPCA with varimax? And how would I do this? Best, Andrea

Hello Liam, I am using phytools to compare rates of speciation between species with different phenotypic traits using the Threshold Model with discrete charecters which are coded for in binary in my matrix. I am getting the problem of "subscript out of bounds" and I am not sure how to fix it. My tree is in nexus format, with 106 tips. The tip labels have been changed to numbers (I guessed from 1 - 106 according to their order in my input file?). My matrix is 3 columns, first represents the species number, the second is if they represent one trait and the second is if they represent the other. This matrix is 106 columns too. A species has to be one of the traits but cannot be both. I wondered if you are able to suggest a reason why the subscript is out of bounds? Many thanks, Josie

Thanks for a great blog! I am trying to estimate discrete ancestral states on a tree and when I use ace with the ARD model: fitARD<-ace(region,tree,model="ARD",type="discrete", use.eigen=TRUE)

I get the error: In nlminb(rep(ip, length.out = np), function(p) dev(p), lower = rep(0, : imaginary parts discarded in coercion.

I also get a rate estimates of 0 for several of the rates that do not seem to be 0. I am wondering how to address this error?

Hi Liam, Thanks for the great resource! I have a question that isn’t necessarily related to phytools, but phylogenetics in R in general. I’m wondering if you have some insight and potentially a phytools-based solution.

I’ve been having some trouble working in R with unrooted trees (with bootstrap values) inferred with RAxML like this one:

The problem is that when I root this tree by an outgroup and display (like below), I find that the bootstap scores have been moved around and one of the nodes has a blank value for a bootstrap score.

I believe this has to do with the fact that Ape (and other R packages and tree viewers) treat support values as pertaining to nodes on the tree rather than internal edges. This problem is described in: L. Czech and A. Stamatakis, “Do Phylogenetic Tree Viewers correctly display Support Values?,” bioRxiv, p. 35360, 2015.

I found a work-around that I think will work for me, which is to plot the rooted tree but display the node labels from the unrooted tree (see below).

plot(rooted_tr) nodelabels(tr$node.label)

I'm just wondering if you have any ideas for avoiding this quirk when rooting trees in R.

Hi Lima, Thanks for creating such a great package and blog. I'm having a problem removing tip.labels from a countMap plot. I have been using contMap with plotTree.wBars but I've decided I no longer need the bars and would just like the contMap but without labeled tips, which I cant seem to do? im' ploting a 450 tip tree as a "fan". I have made a work around buy still using plotTree.wBars and making x = 0, and border="white", which is fine but I thought there might be a more elegant way? Best regards Matt Larcombe

I have tried to build a SIMMAP null model. So, I have shuffled the tips (discrete states) from original tree and re-ran a SIMMAP (make.simmap function) to store the ages of state transitions . When I run make.simmap using original tree, everything works well. However, when I shuffle the tips, some estimations last forever. Are there some combinantions unfeaseble to reach a stable solution? I guess sometinhg happens with the estimated Q matrix. I would appreciate a lot any advisement.

#Original tree map=make.simmap(tree,state,nsim=1000,model="ER")

#Null model map_rand=list() for(i in 1:1000){ state_rand=sample(state,length(state),replace=F) map_rand[[i]]=make.simmap(tree,state_rand,nsim=1,model="ER") }

When the values estimated for Q matrix are very extreme, the function runs forever. For example, I waited 24h with no solution for this Q matrix:

make.simmap is sampling character histories conditioned on the transition matrix Q = 1 2 3 1 -5512.302 2756.151 2756.151 2 2756.151 -5512.302 2756.151 3 2756.151 2756.151 -5512.302 (estimated using likelihood); and (mean) root node prior probabilities pi = 1 2 3 0.3333333 0.3333333 0.3333333

Is there a blog post / function capable of combining the functionality of contmap and dottree? I'm looking to plot a continuous character with a color ramp on the tree with a discrete character dot next to the tips of the tree. If I there is a post on this I have missed it in my search and would appreciate being pointed in the right direction if it is out there.

Hi Liam, I'm having issues with using the make.simmap function. I'm trying to set the prior distribution on the root node using the following arguments:

and I had this Error message: Error in getPars(bt, xx, model, Q = NULL, tree, tol, m, pi = pi, ...) : formal argument "pi" matched by multiple actual arguments

Could you help me to fix this problem? Thanks a lot for your attention!

I really enjoy the plotBranchbyTrait function. I'm plotting substitution rates and the color scale really makes the long branches pop.

Everything was going great for a few weeks, and now with the same trees I used earlier I'm no longer able to get the scale bar to add. Oddly, I can get it to add with your demo example, so I guess it's something with my trees/data, although the same script worked fine a few weeks back. It plots my tree with lovely colors, but never asks about the scale bar.

My basic pipeline is this:

1) Read in tree with branch lengths 2) Create a branch length variable 3) Plot with plotBranchbyTrait calling the tree and branch length variable, method="edges"

With no error messages or warnings, I'm lost. Any ideas for troubleshooting?

Thank you very much for your time and contribution to the R world!

Hard to diagnose without the example in hand. If you save your workspace & Rhistory & email them to me, I can try to reproduce the error & get to the bottom of it. Liam

is there a function in Phytools similar to "plotRateThroughTime" or "getRateThroughTimeMatrix" of the package "BAMMtools"? These methods allow to plot evolutionary rates through time, e.g., http://lukejharmon.github.io/ilhabela/instruction/2015/07/02/diversification-analysis-bamm-rpanda/ (see the "Rate through time plots"). Your rates (or any other continuous data) are associated with the edges of your tree and plotted against time.

Unfortunately, the BAMMTools methods only work with bammdata objects (I'm also not sure, whether they can handle non-ultrametric trees). I was just wondering, whether phytools offered a similar option for more generic data (tree + vector or your data)?

After all, using "plotBranchbyTrait" of the "phytools" package already enables us to plot continuous values (as colours) onto the edges of our phylogeny. The generated object must therefore already contain the association between edges and our continuous data. Is there an easy way to plot our data against time (or any other branch length unit) as in the examples mentioned. See Cooney et al. (2017, fig. 2c) for another example of these plots: http://www.nature.com/nature/journal/v542/n7641/full/nature21074.html

Dear Liam, i have a question for you; Do you know any R package that can run ancestral state characters reconstructions for continuous character under linear (Wagner) Parsimony? Thanks in advance !

Andres. If any package can do this, it would probably be phangorn. I would contact the developer, Klaus Schliep (who works in my lab), if you have any questions.

Liam, thanks for your quick answer. Unfortunately, if i am not mistaken, phangorn package uses molecular data (dna sequences) instead of continuous morphological data. Or, i´m missing something...

Liam, thanks for your quick answer. Unfortunately, if i am not mistaken, phangorn package uses molecular data (dna sequences) instead of continuous morphological data. Or, i´m missing something...

Hi Andres. phangorn can analyze any type of discrete character through the data type "USER", not just molecular data. I'm not sure how Wagner parsimony is defined for continuous traits. It is possible this is equivalent to what was once called 'linear parsimony.' Linear parsimony, if I remember correctly, involved identifying the set of ancestral states with the minimum sum of absolute values of changes. There is a related method called 'squared-change parsimony.' This method involves minimizing the sum of squared changes; however if we take edge lengths of the tree into account we have 'weighted squared change parsimony' - which turns out to give us the same solution as the ML estimates under a Brownian assumption. I'm not aware of any similar justification for linear parsimony, nor am I aware of an implementation of linear parsimony in R - although it would not be hard. - Liam

I was wondering how I can exclude the past few million years of branches from my time tree. I have a phylogeny that began diversifying around 40MYA and I wanted to see how the gamma statistic value changes when I exclude recent speciation events. I tried searching for this, but may be I am not searching by the right key word. Regards, Aparna

I wanted to map the CI on the timed tree after pruning it. To do this, how can I get the matrix of CI from a posterior distribution of trees (empirical data)? Thanks, Aparna

I have encountered an issue when I wanted to plot a phylogenetic tree with branched colored by values for a quantitative trait and I'm hoping you can help me out with. The problem is that the values of trait is assigned randomly to the species so the tree does not represent the actual data. Here is my code: library(phytools); library(picante) TREE.R20120829<-"(((((((((((((((((((((((Viola_riviniana,Viola_americana,Viola_pedatifida)Viola,(((((((((((((((((((((Trifolium_repens)trifolium))))))))irlc)))))))))papilionoideae))))fabaceae))fabales)))fabids))rosids),(((((((((((((((((((((Verbena_stricta)Verbena)verbenaceae)))),((Veronica_montana)Veronica)scrophulariaceae))))))lamiales)))lamiids,((((((((((((((((((((((((((((((((((Artemisia_dracunculus,Artemisia_frigida,Artemisia_giraldii,Artemisia_sacrorum,Artemisia_scoparia,Artemisia_tridentata)artemisia))))))))))))))))))))))))))asteraceae)))))asterales))campanulids))ericales_to_asterales)asterids)))))core_eudicots)trochodendrales_to_asterales)sabiales_to_asterales)eudicots)ceratophyllales_and_eudicots,(((((((((((((((((((((((((Zea_mays)zea)))))))pacc,(((((((((((Triticum_aestivum)triticum,(Bromus_briziformis,Bromus_inermis,Bromus_tectorum)bromus))))))))))bep))),(Festuca_arundinacea,Festuca_idahoensis,Festuca_ovina,Festuca_rubra)Festuca,(Agropyron_cristatum,Agropyron_desertorum)Agropyron)poaceae))))))poales)commelinids))))))monocots)poales_to_asterales)magnoliales_to_asterales)austrobaileyales_to_asterales)nymphaeales_to_asterales)angiosperms,((((((Pinus_massoniana,Pinus_sylvestris,Pinus_taeda)Pinus)pinaceae)pinales)))gymnosperms)seedplants)euphyllophyte;"

phy.R20120829<-read.newick(text=TREE.R20120829)

phy.R20120829<-collapse.singles(phy.R20120829)

traits <- read.table("D:\\Gordon\\Review\\R codes and files\\test.csv",header=TRUE,sep=",", quote="", row.names = 1)

This is “b” that the values for each species is completely different from above data: > b Viola_riviniana 0.31310231 Trifolium_repens 0.47677369 Artemisia_dracunculus 1.06745099 Artemisia_sacrorum -0.04652002 Zea_mays 1.04949307 Bromus_inermis 1.33913581 Festuca_idahoensis 1.50407740 Agropyron_cristatum 2.84354046 Pinus_sylvestris 1.42138568

I am lost! Any suggestion for troubleshooting? Thank you so much for your time and contributes.

nice

ReplyDeleteSo far as I can tell, the above comment is spam - but since it is the only one I've got (as of 10/3/2011) on a whole page dedicated to comments, I have chosen to leave it up.

ReplyDeleteHi Liam,

DeleteI've been applying evol.rate.mcmc to some data and am trying to get my head around the analysis. From what I understand it finds a single rate shift in rates of phenotypic evolution within a tree. Is there any way to test the 'significance' of any identified shifts (i.e. especially for more subtle identified shifts)? I hope that question makes sense, thanks for any advice.

regards

Ha! Here's a real comment for you: I am having a hard time using the phyl.pca function in phytools. No matter that I am passing the mode "cov", I get an error.

ReplyDelete> pca<-phyl.pca(C, Y, method="BM", mode="corr")

"Error in colSums(invC %*% X) : error in evaluating the argument 'x' in selecting a method for function 'colSums': Error in invC %*% X : requires numeric/complex matrix/vector arguments. Any ideas?

Hi Anonymous.

ReplyDeleteIs C the vcv matrix from the tree? phyl.pca (unlike the function phyl_pca published in the appendix to Revell 2009) takes the tree not the vcv matrix as input, so you would do:

> pca<-phyl.pca(tree,Y,mode="corr")

If this is not your problem, I would be happy to troubleshoot any issues. If it is, then sorry for the confusion!

- Liam

That was it! Thanks! BTW, I like phytools! Very useful so far!

ReplyDeleteDear Liam,

ReplyDeleteYour blog is very interesting. As you work with Simmap, I would have a very basic question: I want to do stochastic character mapping on 10000 Bayesian trees in Simmap and display the results on the consensus tree. What output files do I have to use - is it the mutational maps, which can be saved optionally? Do I have to summarize all individual maps to display them on the consensus? I had no problem to make ASR in Simmap, but after checking the manual I still don't understand how to get the output for stochastic character mapping and how to display the results. Do you have any hint for me?

Thanks,

David

Hi David.

ReplyDeleteI have nothing to do with the program SIMMAP which is maintained by Jonathan Bollback. The phytools function make.simmap can do basic stochastic mapping. Other functions (read.simmap, write.simmap) in the phytools package read & write trees in the SIMMAP format.

In the future I hope to add some of the functionality you are talking about to the phytools package, but I am not there yet.

Thanks for checking out phytools.

- Liam

Hi Liam,

ReplyDeleteI am a PhD student of Vladimir Minin's at UW, and we have noticed some potentially undesirable behavior with the

optim.phylo.lsfunction.What we observed is that there appears to be some stochasticity when using the "fixed=TRUE" option; that is, it would sometimes return different answers when run repeatedly with the same input tree and sequence alignment.

The culprit appears to be the

multi2difunction withinoptim.phylo.ls. Themulti2difunction purports to resolve polytomies by explicitly denoting them as having branches of length 0, as opposed to a lack of a branch whatsoever. However, it is self-admittedly a stochastic procedure, in terms of exactly where it adds branches -- it seems that it should not matter in principle since it is merely adding branches of length 0. However, there appear to be some undesired consequences when each of these resolved trees are used inoptim.phylo.lsAs a demonstration, consider the following multifurcating tree:

http://i16.photobucket.com/albums/b37/peterchi/T1.png

Now, suppose we want to use this as the fixed topology on which to estimate the least squares branch lengths for a given sequence alignment. Let's say that we have saved the matrix of pairwise distances (according to JC69) as dist2.stan. Then, we would use the following R code:

T2<-optim.phylo.ls(dist2.stan,stree=T1,fixed=TRUE)

However, if we run this repeatedly, we will observe that we obtain different answers between iterations:

> T2<-optim.phylo.ls(dist2.stan,stree=T1,fixed=TRUE)

best Q score of 0.0040827015 found after 0 nearest neighber interchange(s).

> T2<-optim.phylo.ls(dist2.stan,stree=T1,fixed=TRUE)

best Q score of 3.27601e-05 found after 0 nearest neighber interchange(s).

> T2<-optim.phylo.ls(dist2.stan,stree=T1,fixed=TRUE)

best Q score of 0.0040827015 found after 0 nearest neighber interchange(s).

> T2<-optim.phylo.ls(dist2.stan,stree=T1,fixed=TRUE)

best Q score of 0.0033840255 found after 0 nearest neighber interchange(s).

> T2<-optim.phylo.ls(dist2.stan,stree=T1,fixed=TRUE)

best Q score of 3.27601e-05 found after 0 nearest neighber interchange(s).

As mentioned above, the cause of these discrepancies appears to be the

multi2difunction. Observe what happens when we repeatedly plot the multi2di(T1) result; that is:pdf("multitrees.pdf",height=4,width=6)

par(mfrow=c(2,3))

plot(multi2di(T1))

plot(multi2di(T1))

plot(multi2di(T1))

plot(multi2di(T1))

plot(multi2di(T1))

plot(multi2di(T1))

dev.off()

We then get the following trees:

http://i16.photobucket.com/albums/b37/peterchi/multitrees.png

And, although they clearly are all the same unrooted tree, we can examine the structure of how each tree is stored (i.e. its edges), as follows:

http://i16.photobucket.com/albums/b37/peterchi/edges.png

It thus seems to be this randomness that is causing the different resulting Q-scores and optimized branch lengths from a fixed topology.

We have been able to come up with a solution for our own purposes, but we just thought we would bring this observation to your attention.

Hi Peter.

ReplyDeleteThanks for bringing this to my attention. Sorry I have not responded sooner. I'll look into it as promptly as I can!

All the best, Liam

Hi Liam,

ReplyDeletedo you know any function that could allow finding the deepest node ancestral to a group of tips? (I mean the oldest MRCA of a group of tips)

thanks in advance!

Yes, I can help you with this. I think that, by convention in "phylo" objects, nodes closer to the root node

ReplyDeleteon a given path from the roothave smaller node numbers then do nodes on the same path that are farther from the root. That is to say, node 13 is not necessarily closer to the root than node 19; but, if for a set of four taxa, the pairwise MRCAs (excluding duplicates) are: 16, 15, & 14, then we know that MRCA of all 4 is 14.Just in case this is not guaranteed to be the case, I have written a function that uses ape::mrca & phytools::nodeHeights to compare the pairwise MRCAs of a list of species to the heights of the MRCAs above the root, and then picks the one with the lowest height. I will post it shortly.

This is here.

ReplyDeleteHi Liam.

ReplyDeletea simple question, how in trees mapped with "make.simmap", you can get the heights in the tree in which there are transition points between mapped states?

thank you very much!

Hi Liam,

ReplyDeleteI am using your anc.Bayes function in phytools with a nexus tree and continuous data. When I run the line "anc.Bayes(tree,data,ngen=10000,control=list())" I get an error message:

"Error in if (post.odds > runif(n = 1)) { : missing value where TRUE/FALSE needed"

Do you know what this might be? I've had no problem with my tree or data in geiger or ape.

Thanks.

My guess is that in your during the MCMC one of: 1) the log-value of the likelihood for the current state of the chain; 2) the log-value of the likelihood for the proposed state; or 3) either of the log-priors; are numerically indistinguishable from -Inf or Inf. One thing you could do to check this is rescale your data by multiplying by 10 or 1/10. In addition, if you send me your dataset I would happy to try and diagnose the issue - and it would also be helpful in improving phytools.

ReplyDelete- Liam

I have sent an email to liam.revell@umb.edu with the relevant files attached (subject: phytools query).

ReplyDeleteThanks.

Hi Liam,

ReplyDeleteSuggestion for you. How about adding % variance explained for the PC axes as part of the output of phyl.pca? Not hard for someone to do separately, but it would be useful output to add. I'm digging phytools.

D

Hi, could you help me out with this? I'll just post my entire script-run.

ReplyDelete> rm(list = ls(all = TRUE))

> setwd("M:\\pc\\Desktop\\Syntese MANUS\\Nye MHC verdier")

> library(ape)

> tree<-read.nexus("tree.nex")

> exon<-read.table("exon.txt")

> attach(exon)

> X<-as.matrix(exon)

> class(tree)

[1] "phylo"

> str(tree)

List of 4

$ edge : int [1:14, 1:2] 9 10 10 11 11 12 12 9 13 13 ...

$ edge.length: num [1:14] 1 1 1 1 1 1 1 1 1 1 ...

$ Nnode : int 7

$ tip.label : chr [1:8] "Ficedula_hypoleuca" "Hylocichla_mustelina" "Luscinia_svecica" "Oenanthe_oenanthe" ...

- attr(*, "class")= chr "phylo"

- attr(*, "origin")= chr "tree.nex"

> class(exon)

[1] "data.frame"

> str(exon)

'data.frame': 8 obs. of 6 variables:

$ V1: Factor w/ 8 levels "Ficedula_hypoleuca",..: 1 2 3 4 5 6 7 8

$ V2: int 112 175 154 149 114 160 125 162

$ V3: int 159 259 241 228 167 243 183 257

$ V4: num 0.163 0.208 0.18 0.175 0.154 0.205 0.176 0.154

$ V5: num 0.134 0.227 0.179 0.146 0.136 0.164 0.18 0.13

$ V6: num 0.202 0.257 0.219 0.22 0.184 0.194 0.209 0.188

> class(X)

[1] "matrix"

> str(X)

chr [1:8, 1:6] "Ficedula_hypoleuca" "Hylocichla_mustelina" "Luscinia_svecica" ...

- attr(*, "dimnames")=List of 2

..$ : NULL

..$ : chr [1:6] "V1" "V2" "V3" "V4" ...

> source("phyl.pca.R")

> pca.results<-phyl.pca(tree,X,method="lambda",mode="corr")

[1] "Y has no names. function will assume that the row order of Y matches tree$tip.label"

Error in invC %*% X : requires numeric/complex matrix/vector arguments

> pca.results

Error: object 'pca.results' not found

If you could help me out, I would be forever grateful.

PS, I saw the similar issue in the other comments, but seeing as "tree" is simply that, a phylogenetic tree and not a distance matrix, I don't get why the script returns this error.

Try:

ReplyDeleteX<-read.table(...,row.names=1)

X<-as.matrix(X)

Let me know if that works. Liam

Dear Liam

ReplyDeleteI am new to this field and have a very question. I want to extract the species names (tip.label) of a tree. I downloaded this tree from one on the articles. It seems that this tree is a combine tree. If you help me how can I extract the species names.

The str of the tree is:

Class "multiPhylo"

List of 3

$ tree mammalST_MSW05_bestDates :List of 4

..$ edge : int [1:7522, 1:2] 5021 5022 5023 5023 5024 5024 5022 5021 5025 5026 ...

..$ Nnode : int 2503

..$ tip.label : chr [1:5020] "Tachyglossus_aculeatus" "Zaglossus_bruijni" "Zaglossus_bartoni" "Ornithorhynchus_anatinus" ...

..$ edge.length: num [1:7522] 102.6 49.9 13.7 0.2 13.5 ...

..- attr(*, "class")= chr "phylo"

$ tree mammalST_MSW05_lowerDates:List of 4

..$ edge : int [1:7522, 1:2] 5021 5022 5023 5023 5024 5024 5022 5021 5025 5026 ...

..$ Nnode : int 2503

..$ tip.label : chr [1:5020] "Tachyglossus_aculeatus" "Zaglossus_bruijni" "Zaglossus_bartoni" "Ornithorhynchus_anatinus" ...

..$ edge.length: num [1:7522] 114.2 40.5 11.5 0.2 11.3 ...

..- attr(*, "class")= chr "phylo"

$ tree mammalST_MSW05_upperDates:List of 4

..$ edge : int [1:7522, 1:2] 5021 5022 5023 5023 5024 5024 5022 5021 5025 5026 ...

..$ Nnode : int 2503

..$ tip.label : chr [1:5020] "Tachyglossus_aculeatus" "Zaglossus_bruijni" "Zaglossus_bartoni" "Ornithorhynchus_anatinus" ...

..$ edge.length: num [1:7522] 91 59.3 15.9 0.2 15.7 15.7 75.2 13.1 44.4 8.4 ...

..- attr(*, "class")= chr "phylo".

The tip names are in the vector $tip.label. You can pull these out from each tree in the 'multiPhylo' object using (say) tree[[1]]$tip.label (for the 1st tree, for example).

DeleteIt worked! Thanks a bunch :)

ReplyDeleteDear Liam,

ReplyDeleteI am trying your phyl.cca function but I keep getting the following error:

Error in ccs > 0 : invalid comparison with complex values

is this a format problem with the data matrices or something else is going on?

thanks for your reply and for developing this awesome 'phytools'

cheers

Arley

Hi, Liam,

ReplyDeleteLike the above individual I'm also trying to use the phyl.cca function, but have been encountering three main issues:

1. Although I have the latest versions of R and phytools, R cannot find the function. It's definitely spelled right...

2. When I source the R code instead, the lambda estimate statement doesn't seem to register (the exact error: unused argument(s) (fixed = F)). I'm guessing this is because the code posted 9/7/2011 doesn't include updates to use lambda.

3. If I take out the lambda part, I get a "subscript out of bounds error," which I know must have something to do with matrices. I've tested the format with is.matrix and the rownames are identified (furthermore, I've successfully used

phyl.pca, which has similar format requirements).

Any suggestions? I'd be happy to send my code and files if needed.

Thanks for taking a look at this. I enjoyed the comparative methods symposium at Evolution in Ottawa!

Mercedes Burns

I have actually fixed the problem I had with phyl.cca

DeleteI just had to re-order the rownames in the data matrices to match the order of the taxa in the tree, and it worked like a charm!

For the record, I'm using version 0.1-8.

arley

Hi Mercedes.

DeleteThe first thing I would do is install the latest version of phytools. You should first detach phytools or (better yet), start a new R session. You can then update the package using update.packages() or install.packages("phytools"). Try to run it again and if you are not able, please let me know.

- Liam

@Arley -

DeleteI'm glad you got this to run, but the report that the order matters concerns me. Did the rows have names beforehand? If they are named (in theory) it should not matter - so I should investigate this!

- Liam

Thanks for your response, Liam! Issues 1 and 2 are resolved, but I'm still getting the subscript out of bounds error. I'll try Arley's solution of reordering the rownames, but yes, let us know if there's a fix!

ReplyDeleteMercedes

Hmm, the reordering rows trick didn't work for me. I get a subscript out of bounds error when I run:

Deletedata<-phyl.cca(tree,X,Y,fixed=F)

X is a matrix with 5 columns, Y is a matrix of 2 columns. The rows are identical in number and name. Any thoughts? Thanks again.

Mercedes

Hi Mercedes.

DeleteHard for me to diagnose, but if you send me the input files, I can try and troubleshoot. This will also help improve phytools if there is a bug.

liam.revell@umb.edu

- Liam

Hi Liam,

ReplyDeleteUsing brownieREML(), I am testing whether the evolutionary rate of a continuous trait (y) is contingent on a discrete multistate trait x <-c("a","b","c").

The evolutionary history of the multistate trait was reconstructed with the function make.simmap(x, model=c("SYM")). However make.simmap assumes that transitions are symmetric (model="SYM"), such that transitions from "a" to "b" are as likely than from "b" to "a". However i don't have a biological justification to assume this a priori. So, I wonder if there is a statistical justification for this assumption or should i try other models (e.g. model=c("ARD"). Unfortunately, results change when i do the latter.

Many thanks for you time and help, and congratulations for this helpful Blog.

Best wishes, Gabriel

Hi Gabriel.

DeleteIf your results seem sensitive to the mapping method, you might also consider running stochastic mapping in the stand-alone program SIMMAP. It can do things not presently implemented in phytools, including sampling across model parameter values. You can export your data & tree using the phytools function export.as.xml (see ?export.as.xml for details) which will create an input file for SIMMAP.

Good luck.

Liam

Thanks Liam. I modified the ace() function to incorporate a designed matrix, for which i have some biological support. This seems to work well with make.simmap(). Cheers, Gabriel

DeleteDear Liam, I am wondering, is it possible to give different colours to the taxa after pruning by using the drop.tip. I just want to show the locations of the used species in the whole tree. All the taxa in tree retains but with different colours? (e.g. used species blue or red and pruned ones grey)

ReplyDeleteI think this is what you're going for: http://phytools.blogspot.com/2012/09/plotting-with-leaves-and-edges-to-be.html

DeleteDear Liam

DeleteThat is really cool... Thumbs up..

Dear Liam Your fancyTree command is really cool and just what I needed. I am very new to this field and please excuse me for my silly question.

ReplyDeleteI want to use this command but I have no idea how I can select the species that I want to prune. I have a super tree with thousands of species and there is my data set with few hundreds species. Now, I am not able to figured it out how to tell which species to retain and which to prune.. Could you help me in this regard.

Hi. If just want to, say, prune your tree to contain only the species represented in your dataset you can just do (assuming your data are in matrix X, and the row names contain species names):

Deletetips<-setdiff(tree$tip.label,rownames(X))

ptree<-drop.tip(tree,tips)

Or, if you want to visualize it using fancyTree:

tips<-setdiff(tree$tip.label,rownames(X))

ptree<-fancyTree(tree,type="droptip",tip=tips)

- Liam

Dear Liam, thanks for the reply. I tried the code but the last line of the code, ptree<-fancyTree(tree,type="droptip",tip=tips)

Deleteis not working. it gives the following message

Error in plot.window(...) : need finite 'xlim' values

I don't know why is not working.

Is there any possibility that the vector, tips, is either empty or contains all the tips in the tree (or, say, all but one). You may also want to update to the latest version of phytools to make sure that you have the newest version of the function.

DeleteHey Liam, Do you have, or do you know of, a function that will do a phylogenetic MANOVA in which the canonical axes are evaluated while taking covariance due to phylogeny into account? I'm looking for something analogous to phyl.pca. I am aware that GEIGER has a function (phy.manova) to do a standard MANOVA and compare a test-statistic to a null distribution based on simulation under an empirically determined single rate matrix (ala Garland et al. 1993), but this approach does not provide phylogenetically informed canonical axes. I would like to know what variables are evolving differently in two groups of taxa. Also, neither group is monophyletic.

ReplyDeleteThanks for your help.

Unfortunately, I don't have a good answer. nlme::gls does not allow multiple response variables. Also, so far as I know, this has not been done with phylogenetic data - which makes me suspicious that simply extending generalized regression/ANOVA to multiple continuous dependent variables might be more complicated than it sounds. You could try the SIG-phylo list....

DeleteDear Liam,

ReplyDeleteIs there a way to perform non-parametric regression analysis like pgls?

Dear Liam

ReplyDeleteI would like to run a pgls analysis where I account for variation in the estimation of my species means. In other words, my response variable is a mean value calculated for each species in my phylogenetic tree. However, the number of samples used to calculate these means differs among the species in my tree. Is there any way to include a weighting factor that accounts for this variation (something like the square root of the sample size) in a pgls context? Do you have any suggestions for how to code this in R?

Many thanks

Nick

Hi Nick.

DeleteHave you tried the phytools function pgls.Ives? For this function, you need sampling variances on your estimates of the means for each species - which in this case would be the within-species variance divided by the sample size for each species.

If you

do nothave within-species variances, just the mean & sample size, please let me know as I'm pretty sure we could fit the Ives et al. (2007) model, but assuming that within-species variance was fixed across species and then estimating it. Obviously, it's better if you do know within species variances.Note that with pgls.Ives you have two options for input: you can give the function the species means, variances of the means in x & y, and sampling covariance between x & y;

oryou can just feed the function with the raw data for all individuals - as long as the ordering is the same for x & y.Unfortunately, so far I've only implemented the bivariate phylogenetic regression in this function.

All the best, Liam

Hi Liam

DeleteMany thanks for your reply. I can calculate the within species variance, but have the problem that for some species I only have one observation (n per species ranges from 1 to 32, and I have 14 species in my analysis). I can try and implement the solutions that you suggest in an earlier blog (http://blog.phytools.org/2012/06/blombergs-k-phylogenetic-signal-with.html), but what do you consider to be a suitably homogeneous intraspecific variance to make this option a possibility? In my case it ranges from less than 1 to nearly 40.

Best

Nick

Hi Nick. The within-species variances for a trait vary 40-fold? A couple of quick questions leap to mind:

Delete(1) This is the variance, not the sampling variance, correct? The relationship is sampling variance = variance divided by sample size. phylosig() takes the sampling variances as input, but when we only have one (or very few) observation(s) per species, I suggest using the mean within-species variance as the sampling variance for those species (or the mean/pooled variance divided by the species-specific sample size). When we have only 1 sample, of course, the within-species variance and the sampling variance of the mean are the same.

(2) Are the data on a linear or log scale? Usually log-transformation will homogenize the variance. If your within-species variances (not sampling variances, remember) differ 40-fold, then it is probably because the trait varies widely among species as well and there is a relationship between the variance and the mean. Log-transformation will usually get rid of this.

I look forward to hearing more details about this and I will post the solution to the blog if it is likely to be useful for or informative to others.

Thanks! Liam

Dear Liam,

ReplyDeleteJust a small question. To you it might be a very simple question but I am not able to sorted it out and also I may not ask the question correctly but still asking it. I am trying to show the distribution of contineous trait data over the phylogenetics tree in colours. something like,

'tiplabels(pch=14, col=mycol(Trait data), cex=0.8)'

but this gives me colours only at the tips as it should but if I want to colour the entire branch, how can I do it. Something like the background of your webpage with a lot of coloured trees.

Thans

Check out the phytools function contMap and let me know if this is what you have in mind. If you have any difficulty figuring out how to use it, please let me know and I would be happy to help. - Liam

DeleteDear Liam, Thanks for the reply. Yes, I am looking for this kind of function. I tried this function but it is giving me an error, "Error in while (x > trans[i]) { : missing value where TRUE/FALSE needed"

DeleteThough "tiplabels" function is working fine. I must add , before using this function, I am also using the "drop.tip", function as well.

Thanks

Unfortunately, I can't troubleshoot that error unless you send me your input data & tree. In that case, though, I would be happy to tree. - Liam

DeleteDear Liam, thank you very much for this help. Ok, first I will try myself to sort it out if I am not able to plot tree then will surely send my data and tree to you.

DeleteThank you very much once again.

In both cases, I will let you know.

Hey Liam,

ReplyDeleteHave you written any functions, or are you planning to, to simulate not just trait means, but trait variances for species? E..g, add trait variation as a output value for each tip along with the mean in fastBM.

Thanks! Scott

Hi All,

ReplyDeleteI have a quick question: how can you assign states to nodes in a phylogeny, based on reconstructions. So lets say I have the data from my reconstruction, I want to say:

Node 16 <- 0

Node 17 <- 1

Node 18 <- 1

Such that I can list the descendants from any given node, and see if they differ from the node which I started with, hope that makes sense!

Dear Liam,

ReplyDeleteI have used make.simmap to obtain 1000 stochastic maps for a larger clade. It contains two sister-clades. My main interest is to compare transitions for these two sister -clades. describe.simmap gives average values for the entire clade, while, countSimmap provides number of transitions for 1000 trees. Is there a way to obatin these values individually for sub-clades within phytools? or Do i have to run the analyses separately for each sister-clade to obtain these values?

Thank you,

VJ

Hi Liam,

DeleteThought inform you. Have run the analyses on the clades separately to obtain mean of the transitions.

VJ

Hi Liam,

ReplyDeleteI contacted you by mail, but the blog may be a nice channel for ideas exchange as well, I believe*.

I'm writting some code to actually simulate geographic dynamics using this CTMC framework. I implemented my code using geiger, but now that I've delved into phytools documentation and also this blog, I'm quite tempted to re-do my stuff to use phytools capabilities. I like the way you can do it all (simulate, fit and estimate) in a very clean and fast way. Since I'm planning a serious simulation study, this seems rather fit.

Do you think it is possible to incorporate an asymmetric Q, provided the restrictions for detailed balance (upon exponentiation) hold?

Cheers and thanks for providing,

Luiz

*I'm sorry to have posted this outside the general comments section, the first time.

Hi Liam,

ReplyDeleteI am asking this question here in the hopes it might be useful to others. Is it valid to compute partial R^2 values when you have multiple predictors in a PGLS or pGLM situation?

Thanks for any thoughts you might have.

Andrew

Hello dear sir,

ReplyDeleteNice blog you have here.

I'm wondering, do you know of a way of quantifying the amount of variation in reconstructed trait values in ancestral vs. younger nodes (in R), i.e. answering the question: "When in evolutionary history did change occur in said trait?"?

I've reconstructed values and all, I just do not want to have to calculate edge values by hand for a 200spp phylogeny :)

Thanks!

Dear Liam,

ReplyDeleteI am running the phy.cca to command of phytools for my dataset ( which I have sent to you by e-mail. Name: Nachiket Shankar ; Subject: phl.cca)

I constantly get this error - and am unable to figure out the problem!

Error message: Error in solve.default(kronRC, y - D %*% a) :

Lapack routine dgesv: system is exactly singular: U[1,1] = 0

Could you please help me out with this?

Hi Nachi.

DeleteI just ran some of the code you sent me. It looks like you are trying to do CCA with only one x & one y? This is not the purpose of of CCA, which is designed for more than one x & y (essentially a multivariate correlation, see description here). You may just want a regular (phylogenetic) correlation, although it is a little difficult for me to recommend that without more information about your data & problem.

All the best, Liam

This comment has been removed by the author.

ReplyDeleteDear Liam,

ReplyDeleteIs there a way to 'slice' a mutational map into successive time frames (or eras) and tally character state changes within each slice as a unit? I am curious to know if there is a method that allows one to estimate the amount of character change over units (bins) of time.

Best wishes,

Jay

Hi Liam,

ReplyDeleteI am trying to test the effect that constraining the root state has on the number of transitions by setting pi = c(0,1) or even pi = c(.001, .999) but it still results in the root being of state 1 not state 2. Is it possible to force the root to be a particular state?

Thanks!

Mericien

Hi Liam,

ReplyDeleteJust a question about computation of single rate BM in Brownie.lite.

Why did you use optimtimization of the likelihood instead of computing directly the MLE? There is cases where the results could differ?

Many thanks,

Best Regards,

Julien

Hi Liam,

ReplyDeleteI've been using densitymap function to summarize mutational maps made in SIMMAP. I have been unable to obtain the posterior probability values for each branch of the density tree as shown in the plotted tree. Is it possible to extract these values when running densitymap?

this brings me to my second question. I have a continuous trait mapped for the same tree. For each branch of the tree this trait takes a value between 0 and 260 (only integer numbers). Is there a way to estimate the correlation between this trait and the probability values from densitymap? Note that this will be a correlation between values in the internal branches, not with tip values. Is this analogous to a correlation between ancestral states?

thanks for your feedback!

arley

Using the function phylosig for estimating lambda. Is the nsim argument skipped or is there a randomization test for Pagel's lambda?

ReplyDeleteChris - no randomization test for Pagel's lambda. This is just ignored in that case. - Liam

Deletebrownie.lite is throwing me an error and I cannot track down why. When I enter

ReplyDelete" H.l.bm <- brownie.lite(Nym.pruned$phy, Hosts, maxit=1000)"

it runs for 10 sec and then:

Error in 1:m : argument of length 0

Any guidance would be appreciated.

Chris

Hi Chris.

ReplyDeleteNym.pruned$phy is a an object of class "phylo" with a mapped discrete character and Hosts is a vector with names corresponding to the species names in Nym.pruned$phy? If that the case, then let me know and feel free to send me your data (or .RData file) so I can investigate.

Thanks. - Liam

Dear Liam,

ReplyDeleteI just discovered your blog, it's a pleasure!

I have a question, maybe you have an idea?

I try to generate all trees being at RF distance of, let's say, 2 from a given tree. Do you have any idea on how this could be implemented and if this seems something complicated or not?

Cheers,

Damien

Would be interesting to know your thoughts on this matter, Liam!

ReplyDeletehttp://carlboettiger.info/2013/10/11/is-it-time-to-retire-pagels-lambda.html

-Dave

I know this comment is old but I think it's based entirely on a misunderstanding of Lambda.

DeleteIs there a non-parametric version of the phylogenetically corrected t-test?

ReplyDeleteHi Liam,

ReplyDeletethanks for both the great blog and phytools :o)

I was wondering if I might ask you for your help. I am trying to test for phylogenetic signal in my data using Pagel´s lambda (phylosig function). I have the latest version of both R and phytools. I import the phylogeny via read.nexus command. I even sorted the species data so that they are in the same order as phylogeny. However, I end up with this error message every single time:

"some species in x are missing from tree, dropping missing taxa from x"

[1] "some species in tree are missing from x , dropping missing taxa from the tree"

Error in vcv.phylo(tree) : the tree has no branch lengths

In addition: Warning message:

In drop.tip(tree, setdiff(tree$tip.label, names(x))) :

drop all tips of the tree: returning NULL

the same data and phylogeny work just fine in other analyses - PIC, calculation of K through Picante multiPhylosignal.

Do you have any idea what might be the reason? I assume it is some silly mistake I am making, but have hard time figuring it out.

Thanks in advance for any reply,

Katerina

Hi,

DeleteI'm having the same issue with my phylogeny tree, and I was wondering if you figured out what was wrong and how to fix it. My tree does have branch length, but I keep getting that same error when running phylosig.

Thanks.

Onja

Hi Liam or anyone who can help me with this,

ReplyDeleteI'm using the evol.rate.mcmc, minSplit and posterior.evolrate functions to calculate and plot splits in trait evolution rate on a phylogeny. I have no problem with the evol.rate.mcmc and the minSplit functions, but when I use the posterior.evolrate function I the following error pops up:

pevol.flor<-posterior.evolrate(mytree, para.m.flor$bp, m.flor$mcmc, m.flor$tips, showTree=TRUE)

Error in split$node : $ operator is invalid for atomic vectors

I also tried

pevol.flor<-posterior.evolrate(mytree, 0.0001211563, m.flor$mcmc, m.flor$tips, showTree=TRUE)

Error in split$node : $ operator is invalid for atomic vectors

I check whether all object I use as arguments are what they should be (matrix, list, etc, that is all ok). Can you help me please?

Thanks a lot!

Marina

Dear Liam,

ReplyDeleteI am very new to this field and when I am trying to get phylogenetic distances or with other phylogenetic commands I get a similar error as mentioned above by Marina.

Input;

- t1<-read.tree(text=aa)

- sample<-read.table(file="densityplotspecies.txt")

- pd("sample","t1",include.root=TRUE)

(with aa as a phylomatic newick output)

I tryed a lot of different things but I keep ending up with this error:

Error in tree$edge.length : $ operator is invalid for atomic vectors

Could you please help me!

Roos

I'm trying to find a function that returns the node number of the most inclusive clade containing some tips, but not others. For example, for the following tree (((1,2),((3,4),((5,6),((7,8),(9,10)))) I want the most inclusive clade containing tips 7 and 9, but not 3. The answer would be the clade containing 5,6,7, and 8,9,10. Thanks in advance for any help you can provide. Joey

ReplyDeleteNice post. I like it. This is useful for me. I have more knowledge about it at:

ReplyDeletehttp://twistwriter.com/what-are-the-branches-of-biology

Hi Liam,

ReplyDeleteI am using PGLS to examine scaling among morphological characters. I wonder if you have advice on the most appropriate way to test for significant differences between the observed PGLS slope and isometry?

Is it advisable to constrain the slope to isometry and then compare the observed and constrained slopes using a t-test?

Thank you.

This comment has been removed by the author.

ReplyDeleteHello

ReplyDeleteI need to use this function

require(phytools)

packageVersion("phytools")

tree<-pbtree(n=76,scale=1)

X<-fastBM(tree,nsim=2)

phylomorphospace(tree,X,xlab="X1",ylab="X2")

but problem here is the laebelling. I want to label nodes with individuals I have instead of t1,t2,......tn.

Please fix this problem

This comment has been removed by the author.

ReplyDeleteHi Liam,

ReplyDeleteI am looking for a way to do a phylogenetically independent PCA that accounts for intraspecific variation within a species. I thought I could indicate members of the same species as a polytomy with branch lengths=0. As a test, I created a tree with one branch that has length=0. I then ran phyl.pca and it crashed R. Any suggestions? There must be a better way to do it.

-Will

Yes, this will not work because the phylogenetic covariance matrix among species in this case will be singular.

DeleteOne tactic is to use species means to get the rotation, and then rotate the individual data using the eigenvectors from the among-species PCA. This allows you to obtain PCA scores from the phylogenetic PCA for all individuals in your analysis, but it does not take into account intraspecific variation in performing the phylogenetic PCA. - Liam

This comment has been removed by the author.

ReplyDeleteHi Liam,

ReplyDeleteI have been trying to use ancThresh for a while. Once the previous issues were solved, now I get this,

> mcmc<-ancThresh(tree,X,ngen=1000,control= list(piecol=cols,tipcol="estimated"),label.offset=0.01)

**** NOTE: no sequence provided, column order in x

MCMC starting....

gen 1000

Error in burnin:nrow(mcmc) : argument of length 0

Could you please provide me with some help?

Hi Maria.

DeleteThis is probably because you need to increase ngen above the default (or decrease the sample interval). The default setting for the number of generations of MCMC to run, the sample interval, and the default burn-in are mutually incompatible! Try increasing ngen=100000 and see if that works.

All the best, Liam

Hello Liam,

ReplyDeletecongratulations on your blog! I've been trying to use Rphylip and hit an obstacle:

While phylip is installed and properly executed via the terminal (Ubuntu Linux),

its path cannot be found in R. I observed you use the function findPath() to obtain it, however it throws an error in R:

> findPath()

Error: could not find function "findPath"

Am I doing something wrong? Should the function be there? Any help would be appreciated.

Cheers,

Andreas

Hi Andreas.

DeletefindPath is an internal function in Rphylip, not part of the namespace. To see its guts, you can type:

Rphylip:::findPath

at the R command prompt.

If Rphylip is not working in Linux it probably because it cannot find where you have put the PHYLIP executables. This is not too surprising as findPath only searches in places where Windows & Mac OS users would normally put these files. For any R session you can use the Rphylip function setPath to set the path - and then you will not have to set it again. For more information about how to use setPath type:

?setPath

at the R command prompt.

Let me know if this helps. Sorry for the slow reply!

All the best, Liam

Hey there!

ReplyDeleteI am trying to reconstruct the ancestral state of a discrete genic characteristic across many loci. For each locus I am randomly sampling 100 trees from the posterior, performing 10 stochastic maps for each sample, and then outputting using the describe.simmap function. In my gene trees I have strong prior knowledge that one particular species should be the outgroup. Is there a way to assign an outgroup to the outputted tree using describe.simmap?

While I have your attention, is there a clever way to summarize ancestral state reconstructions across multiple loci? I am having trouble conceptualizing a method to so because the character I am reconstructing is one of the gene and not the species on a whole, and the phylogenys that I am reconstructing on will most likely have slightly different topologies. One summary statistic I can think of is to plot out the distribution of transitions from one state to another. But, that is all I got.

Any help, or pointing in the right direction would be greatly appreciated!

Please ignore my question about exporting a rooted tree using describe.simmap. I added an outgroup prior in BEAST...duh...

DeleteHi, I'm trying to run Pagel's correlation method. However I'm having some trouble. I followed all code posted here (http://blog.phytools.org/2014/12/r-function-for-pagels-1994-correlation.html) and everything goes nicely. But when I use my own data:

ReplyDeletelibrary(phytools)

source("fitPagel.R")

.check.pkg<-phytools:::.check.pkg

m <- read.table("matriz",row.names=1)

tree <- read.nexus("tree.nexus")

x <- m[,1]

y <- m[,2]

names(x) <- rownames(m)

names(y) <- rownames(m)

fit.ape<-fitPagel(tree,x,y)

I get the following error:

Error in matexpo(Q * EL[i]) : NA/NaN/Inf in foreign function call (arg 1)

3 matexpo(Q * EL[i])

2 ace(xy, tree, type = "discrete", model = iQ) at fitPagel.R#30

1 fitPagel(tree, x, y)

Any help would be highly appreciated.

Thanks in advance!

Luis

Hi again!

ReplyDeleteI tried this:

install.packages("matexpo")

Warning in install.packages :

package 'matexpo' is not available (for R version 3.1.2)

perhaps is my R version?

Best!

Luis

Luis.

Deletematexpo is in ape. You could try installing geiger & running method="fitDiscrete". If that doesn't work, I would be willing to try to run your data and see if I can figure out the problem.

- Liam

Dear Liam,

DeleteOk, I'm sending you an email to liam.revell@umb.edu with my data.

Thanks a lot for your help and time!

Luis

Hi, I'm getting the same error. Did you find a solution in the end?

DeleteThanks

Iyar

Hi Liam,

ReplyDeleteI am trying to run the "aov.phylo" function in geiger and am getting the following error: Error in aov.phylo(matrix ~ grp, tree, nsim = 50, test = "Wilks") :

'formula' must be of the form 'dat~group', where 'group' is a named factor vector and 'dat' is a data matrix or named vector.

This is the code I am using:

#setwd

library(geiger)

tree<-read.nexus("tree.nex")

dta<-read.table("namosi.txt", header=TRUE)

matrix<-as.matrix(dta)

group<-as.factor(dta$row.names)

aov.phylo(matrix~group, tree, nsim=50, test="Wilks")

And my data matrix format:

"SLA" "LDMC" "Lf.Size" "Flwr.Size" "Height" "Tri.Den"

"anthropophagorum" 47.55 108.94 227.98 6.11 160 2

"coleoides" 69.3 129.62 313.07 9.71 206.5 0

"multiseptata" 45.19 122.77 262.82 11.59 122.33 1

"vitiensis" 15.9 97.06 737.21 8.16 97.17 0

Any advice you could provide would be greatly appreciated!

Thank you,

Melissa

Hi Liam, thank you very much for sharing all these useful information with the scientific community and for the time and support you devote and provide.

ReplyDeleteI have a couple of questions about the way in which the package caper handle the "relevel" and "anova" function in R. These work fine in other packages such as ape but when i try a PGLS analyses in caper i get some issue. The "relevel" function runs fine but the output table comes out with the original values in terms of intercept of reference. When i ask for an anova table instead i get this:

Error in terms.formula(formula, data = data) :

invalid model formula in ExtractVars

Any idea about what i may be doing wrong? Thank you for your time.

Riccardo

Hi,

ReplyDeleteAs a new user, I need to simulate trees using phytools and wondering what would be the best resource in order to figure out how should be manipulating different tree parameters?

Tnx

Hi Liam, is it possible to estimate ancestral states using a multiple mean OU model in contMap?

ReplyDeleteThanks!

Hi Camila. It is not, but if you have estimated ancestral states from a different function, it is possible to use contMap to plot them. There is an explanation of how to do this here: http://blog.phytools.org/2014/08/droptip-method-for-object-of-class.html. If this is what you want to do, let me know and I may write another post about it or built a function to make it easier. - Liam

DeleteHi, Liam!

ReplyDeleteI'm trying to apply regressions, ANOVA and ANCOVA using the pGLS. When I am using discrete predictor on O-U model (corMartins), the following problem appears for most of the associations I tried to fit:

> pglsModelOU <- gls(continuous.trait ~ discrete.trait, correlation = corMartins(1, phy = crab.tree),

+ data = eubrachy.data, method = "ML")

Error in gls(continuous.trait ~ discrete.trait, correlation = corMartins(1, :

singular convergence (7)

However, If I change the position of my traits (discrete.trait ~ continuous.trait), the problem does not occur anymore. Dos it make sense? Any advice?

Thanks in advance!

Samuel Faria

This comment has been removed by the author.

DeleteHi Liam,

ReplyDeleteThere are several functions available for simulating rate shifts in trees (e.g., SimTree, TESS), but these implement tree-wide shifts, which are somewhat unrealistic, rather than shifts descending from a particular node. Is it possible to implement a more realistic rate shifted tree? Do you have any ideas on how it might be done?

Thanks.

Hi Trevor.

DeleteThis wouldn't be too hard to do; however I'm surprised that it is not out there. Perhaps you should post your query to the R-sig-phylo email list as if it has already been implemented in software, someone on that list will surely know.

All the best, Liam

Hi Liam,

ReplyDeleteI am have issues with some code that can potentially be a bug - some of the colors are printing in the wrong order. I am attempting to plot colors by states on a pie graph for node labels. I used the two codes below. For the second code, if I list the names in alphabetical order of trait, it plots correctly.

On a separate note, I was wondering if there's anyway to use densityMap for more than one state? I have 9 discrete states and a tree with 1600+ tips. The plot with pie charts plotted comes out a bit messy since there are so many nodes. When I plot the tree as a fan it appears a little better, but still a bit cluttered.

Thanks!

cols <- c("brown", "green", "light blue", "blue"); names(habs)<-c("Fossorial", "Terrestrial", "Freshwater", "Marine")

cols2 <- setNames(c("brown", "green", "light blue", "blue"), sort(unique(x)))

Hi Liam,

ReplyDeleteThanks very much for the amazing blog! It has made my brief foray into phylogenetics a lot more bearable!

I have a question about testing significance of the slope for pgls.Ives. I know this has come up before and the suggestion was to constrain the slope to 0 and run a likelihood ratio test. This all seems fine so far. As far as I can tell, you can specify the upper and lower bounds of the slope to be zero in the optim function. However, this then fails due to non-finite difference values, I think in the likelihood function. Is there a way round this? You can set the slope to be close to zero, but not actually zero. All of this could be completely wrong however, so any help would be much appreciated!

Thanks,

Adriana

Hi Liam,

ReplyDeleteI am simulating correlated phylogenetic data by using fastBM and then applying the correlation structure by multiplying the data matrix by the cholesky decomposition of my correlation matrix (M) to generate a new data matrix (newdata). To test that the results make sense, I calculate the phylogenetically independent vcv matrix using the method that you describe in your 2009 paper on phylogenetic size correction and pca. Once I have the vcv matrix, I convert it to a correlation matrix (B) and multiply newdata by inverse of the cholesky decomposition of B.

This produces a data set very close to the original data simulated under fastBM so long as the number of characters is less than the number of taxa. If it's more, I get an error along the lines of "the leading minor of order 101 is not positive definite." The reason that I've determined is that the number of positive eigenvalues of the phylogenetically independent vcv is equal to the number of taxa minus one. Have you ever encountered this problem before? Is there a good solution?

Best regards,

-Will Gelnaw

Liam,

ReplyDeleteI'm trying to test if my trait of study has been canalized over time. I don't just want to know if the breadth of the trait has been reduced, but I also need to know if the current trait values falls within the ancestral values.

So far, my best idea has been to separately reconstruct max and min values for the trait and use that to estimate ancestral breadth. It might be a little tricky though because its a multivariate trait. I've been thinking that I would have to separately reconstruct the max and min for each PC axis.

If this method is flawed or there's already a way to test this, can you let me know?

Thank you for all the help you've already given me through this blog. You are a hero of grad students!

-Tom

Hi Liam,

ReplyDeleteThanks for providing such great resources. I have a couple of questions that I cannot seem to find answers for (and hope I have not missed anything obvious).

Is it possible to outline branches in plotSimmap? I would like to use black, grey and white for simplicity, but only if I can see the white branches...

Also, I notice that the colour at the nodes (vertical branches on a square phylogeny) are the colour of just one of the tipwards branches (the bottom branch). Is it possible for the nodes to be coloured equally by the two branches?

Thanks again,

Elizabeth

Hi Elizabeth. I have now addressed this question here. Let me know in the comments for that post if this is what you have in mind.

DeleteAll the best, Liam

This comment has been removed by the author.

ReplyDeleteHi Liam,

ReplyDeleteI'd like to plot/analyse a phyl.pca object with existing packages like e.g. with ggbiplot and thus am in the situation where I'd like to convert it into a prcomp or princomp object with

sdev

x

rotation

scale

center

attributes filled. I'm calling phyl.pca as follows:

pc.pca<-phyl.pca(tree,df,method="BM",mode="corr")

I compute/fill the necessary attributes as:

sdev=sqrt(diag(pc.pca$Eval))

x=pc.pca$S

rotation=pc.pca$L

scale=FALSE

center=FALSE

But somehow this doesn't seem to be correct. The sd seems to be wrong.

Also I'd like to know how to obtain scale and center values...

Lastly, I'd like to project supplementary variables onto the phyl.pc's coordinate space by cor(newvar.df,pc.pca$S), correct?

Sorry for asking so many, hopefully not too stupid questions.

Best,

Daniel

Hi Liam,

ReplyDeleteI am looking for a function to calculate the support of nodes in consensus trees (e.g. Bremer support) but I found none. Do you know if any package includes that? Ape, which has the function consensus() does not.

Do you plan to include any of these things in the future in phytools?

Many thanks!

Borja

Hi Borja. You can get bootstrap values for a set of trees from which we have computed a consensus tree using the function prop.part from the ape package.

DeleteWith regard to Bremer support - there may be some way to compute these values using the phangorn package. I would email Klaus, the author of that package, or R-sig-phylo.

Borja, it was great to meet you in TucumÃ¡n. Hopefully our paths cross again in the future!

All the best, Liam

Thanks! I will try that.

DeleteCheers,

Borja.

Dear Liam,

ReplyDeleteI used the function phyl.pca with method="lambda" to perform a pPCA. The optimal lambda is less than 1 (0.747, precisely). How can I test if this is significantly different than 1 for the same dataset and tree?

Thank you very much!

Alessio

Hey Liam, thank you so much for solving other people's problems. I got some of your tips by Google and now I can calculate K values & its p-value after three weeks dealing with data format (rooted & dichotomous) and so on. Keep doing good things. Cheers.

ReplyDeleteHi,

ReplyDeleteI am trying to generate a list of apomorphic discrete character changes for all nodes in a tree in R. This is the kind of thing that nearly every piece of phylogenetics software outside of R can do, but I need to extract data from such a list for a large number of trees in a bayesian sample. Any suggestions?

JS

This comment has been removed by the author.

ReplyDeleteThis comment has been removed by the author.

ReplyDeleteHi Liam,

ReplyDeleteIt may sounds like a very simple question but I need to simulate birth-death trees by means of the following code. It does not always return wanted number of trees so I was wondering how can I change the code to make it conditioned on the number of returned trees with n extant species.

sim_trees<- pbtree(b=0.5, d=0.2, n=64, t=NULL, scale=NULL, nsim=50, type="continuous", extant.only=TRUE)

Cheers

Hi Liam,

ReplyDeleteWhile counting the number of clades in a set of "rooted" bootstrap trees using prop.clades(), I see, apparently erroneous, results. I have posted the question on stackoverflow (the link is http://stackoverflow.com/questions/33811950/how-to-properly-use-prop-part-and-prop-clades-functions-in-ape-for-bootstrap).

Can you please quickly look at the post to see if I am calling the wrong method or its just that my interpretation is wrong.

Thanks in advance,

Ikram

Karolinska Institute, Stockholm

Hi Liam,

ReplyDeleteIm working right now on a phylogeny using the picante function "match.phylo.data", and i am getting again and again this error

> mdata<-match.phylo.data(CeroxTree, Biox)

[1] "Dropping taxa from the data because they are not present in the phylogeny:"

[1] "1" "2" "3" "4" "5" "6" "7" "8" "9" "10" "11" "12" "13" "14" "15" "16" "17" "18" "19"

[20] "20" "21" "22" "23" "24" "25" "26" "27" "28"

[1] "Dropping tips from the tree because they are not present in the data:"

[1] "Aphandra_natalia" "Ammandra_decasperma"

[3] "Phytelephas_tumacana" "Phytelephas_aequatorialis"

[5] "Phytelephas_macrocarpa" "Phytelephas_tenuicaulis"

[7] "Ravenea_sambiranensis" "Ravenea_krociana"

[9] "Ravenea_hildebrandtii" "Ravenea_moorei"

[11] "Ravenea_robustior" "Ravenea_rivularis"

[13] "Ravenea_albicans" "Ravenea_xerophila"

[15] "Ravenea_louvelii" "Oraniopsis_appendiculata"

[17] "Juania_australis" "Ceroxylon_ventricosum"

[19] "Ceroxylon_amazonicum" "Ceroxylon_echinulatum"

[21] "Ceroxylon_parvifrons" "Ceroxylon_quindiuense"

[23] "Ceroxylon_parvum" "Ceroxylon_vogelianum"

[25] "Pseudophoenix_vinifera" "Pseudophoenix_sargentii_var_navassana"

[27] "Pseudophoenix_lediniana" "Pseudophoenix_ekmanii"

Warning message:

In drop.tip(phy, setdiff(phytaxa, datataxa)) :

drop all tips of the tree: returning NULL

The point is that the taxa names of the phylogeny and the dataset dont match, but I cant understand why. Even if in the error the taxa names of the dataset appear to be number, in the real data set they are not. I have checked the taxa names of the phylogeny and the data set and they do match. So I dont understand why instead of the real taxa names that I have in my dataset, the function uses the numbers of the rows of the dataset (these are not a column). Am I clear?

Thanks!!!

This comment has been removed by the author.

ReplyDeleteDear Liam,

ReplyDeleteI have encountered an issue in the behavior of tiplabels when added to a densityMap plotted facing leftwards. When I try to add pie-style tip labels, I get the following error:

> tiplabels(pie=to.matrix(Short[HeightDMap$tree$tip.label],sort(unique(Short))),col=cols2)

Error in symbols(xpos, ypos, circles = radius, inches = FALSE, add = TRUE, :

invalid symbol parameter

The command I am using to plot pie tip states works perfectly when the density map is plotted to the right. Using thermo instead of pie with an identical argument works fine facing left as well. I tried it with the original ML tree too, and pie worked fine facing left as well. I haven't been able to find any mention of this specific scenario here or elsewhere. Any suggestions?

Margaret

p.s. I'm still learning to use phytools, but so far it's impressively powerful!

Hi Liam,

ReplyDeleteYour blog is very helpfull.

I was wondering if you can advice how to transfer bootsrtap values from different topologies that contain difeerent nodes but the same species?

Many thanks,

Ale

Hi Liam

ReplyDeleteI am a PhD student, my project about behavioral evolution study, I need to ask, do you have any training course about using R software for reconstructing phylogenetic tree?

Hi Dr. Revell,

ReplyDeleteI'm having a hard time figuring out how to use make.simmap in phytools 0.5.10. I'm trying to run it like this:

map<-make.simmap(tree=tr, x=chars, model='ARD', nsim=200)

in which chars is a matrix with taxon names (same as tr's tip labels) as row.names and tip states (single-character strings) in a column (the only column).

But then I get this error message:

Error in rowSums(xx) : 'x' must be an array of at least two dimensions

I suspect that the problem is in the format of x. I've also tried inputting x as a named vector, to no success.

I realize this is probably a stupid question, but that's because I'm now to phytools, and to R, and to comparative methods in general.

Thanks a lot,

Hi Liam,

ReplyDeleteI have been using your function 'treeSlice' to section a tree at different time intervals (5mya, 10mya, etc). However, the subtrees are collapsed onto the crown, so that a subtree cut at 5mya will have an age of 2my, because the first 3my is a single branch. Do you know of a way to either prevent this from happening (so the subtrees are not collapsed onto the crown) or, after the fact, append a branch at the root (to "restore" the stem)?

Thanks in advance.

Hi Trevor.

DeleteThe stem is not gone - just stored in the element root.edge.

More details on the blog here.

Hope this helps clarify things. - Liam

Hi liam,

ReplyDeleteI am use cophylo to compare one hierarchical clustering with one phylogenetic tree. I wonder if there is any way to color the tip (in my case species name)?

Thanks，

Chen

I have a question about the heatmap associated with my phylogeny. When part of my tree is collapsed, can the heatmap values in the collapsed part be accumulated?

ReplyDeleteThis comment has been removed by the author.

ReplyDeleteDear Liam,

ReplyDeleteI am using your Phytools to conduct the Ancestral state reconstruction for my discrete characters. the following is the commands:

> tree <- read.tree("example.trees")

> tree

Phylogenetic tree with 30 tips and 29 internal nodes.

Tip labels:

Slatifolia, Svulgaris, Sturkerstanica, 2991, OSR, MH_A, ...

Rooted; includes branch lengths.

> character <- read.delim("30Samples.chromosome.name.txt", header=T, row.names=1)

> setdiff(tree$tip.label, row.names(character))

character(0)

> B29_O41 <- character$B29_O41

> names(B29_O41) <- row.names(character)

> B29_O41

Sundulata 1672 6184 BRP MH_B MH_E

1 1 1 1 1 1

MH_H MH_J MH_K MH_L 2121 OSR

1 1 1 1 0 1

MH_F MH_D BDA BWT NOU OPL

1 1 1 1 1 0

0913 1661 2991 MH_A MH_I 0147

1 1 1 1 1 1

MH_C MH_G Sturkerstanica Svulgaris Slatifolia Dianthus

1 1 ? 0 0 0

Levels: ? 0 1

> plotTree(tree,fsize=0.8,ftype="i", type="phylogram")

> cols<-setNames(palette()[1:length(unique(B29_O41))],sort(unique(B29_O41)))

> tiplabels(pie=to.matrix(B29_O41,sort(unique(B29_O41))),piecol=cols,cex=0.2)

> add.simmap.legend(colors=cols,prompt=FALSE,x=0.9*par()$usr[1], y=2,fsize=0.8)

> fitER<-ace(B29_O41,tree,model="ER",type="discrete")

> likelihood <- round(fitER$lik.anc,3

+ )

> nodelabels(node=1:tree$Nnode+Ntip(tree),

+ pie=fitER$lik.anc,piecol=cols,cex=0.5)

> tiplabels(pie=to.matrix(B29_O41,sort(unique(B29_O41))),piecol=cols,cex=0.2)

It works well, but there are some mismatches between the real characters and the tips states on tree.

from the picture, some mismatches could be find:

For our BRP as "1" character, but it labeled as "?" in tree,

For MH_I, MH_C were as "1" character, but label as "0";

For Svulgaris, Slatifolia were as "0", but labeled as "1";

For Sturkerstanica as "?". but it was labeled as "1"

I have 22 characters, I try the other characters, they also show this mismatches.

Do you know what's the problem for those? any suggestion on this?

thanks for your time and help.

Best

Zhiqiang Wu

Thanks for your suggestion on the email. It works now.

Deletezhiqiang wu

Hi Liam,

ReplyDeleteAfter reconstruction of ancestral characters by rerootingMethod and make.simmap, I used nodelabels to plot pie charts of reconstructed characters onto internal nodes. However, I found the default size of the piecharts are too big for my tree. May I ask if there's any way to adjust the size of piechart in phytools?

Thanks,

Cheng-Chiang

Never mind. I just solved this issue by adjusting nodelabels with smaller cex.

DeleteDear Liam,

ReplyDeleteI have produced LTT plots for a number of clades, but the phylogenies they are based on are not scaled to real time (instead they are ultrametric trees scaled to substitutions/site).

This makes the relative time x axis on the LTT plots not very useful to me. If you knew of any way to manually callibrate real time scales directly to the LTT plots, I would be very interested. I have root divergence dates from the literature that could be used in this, but I'm not sure how to add these to the plots.

Thanks very much,

Zach Gayk

Hi Liam,

ReplyDeleteI did reconstructions of ancestral discretely-valued states under ARD and SYM models (ARD was suggested the best model according to comparison by using geiger, and SYM the second). Likelihood values for the ancestral states at a few internal nodes were negative, and thus failed the plotting of nodal piecharts. Reconstruction under ER generated no negative likelihood values.

Could you provide fix to this problem?

Thanks,

Cheng-Chiang

No, but my guess is that they are zero and are showing up as very slightly less than zero due to numerical precision (limitations in how precisely a computer can store a number). You could try rounding the object to (say) 8 digits, i.e.:

Deletefit<-ace(....)

fit$lik.anc<-round(fit$lik.anc,8)

## or:

fit<-rerootingMethod(....)

fit$marginal.anc<-round(fit$marginal.anc,8)

If this doesn't have the effect of zeroing the negative values, then please send me your saved workspace & I will see if I can figure it out. - Liam

This comment has been removed by the author.

ReplyDeleteHi, Liam

ReplyDeleteHow can I test phylogenetic signal from geometric morphometrics data? I have the tree of 14 spp. of parasites and morphological traits..there are examples of these?? Tkanks, Abril

Hi Abril.

DeleteYou may want to look into the research of Dean Adams and others on this topic.

Good luck! Liam

Hi Liam

ReplyDeleteI have a quick and very basic question: is there any function in phytools to fit models of continuous traits evolution (something analogous to fitContinous in geiger) that admits missing data in the traits matrix?

Thanks in advance and congrats for this blog!

Susana

Liam,

ReplyDeleteThis blog is awesome!

I'm having an issue with ladderize {ape} and Ancestors {phangorn} that I'm hoping you can help me out with.

library(ape)

library(phytools)

library(phangorn)

#make tree with 5 tips and 4 internal nodes

tree<-read.tree(text="(t1:228,((t2:57.065588,(t3:8.68127,t4:8.68127)100:48.384317)99:19.099685,t5:76.165273)100:151.834727);")

tree

is.ultrametric(tree) #its ultrametric

plot(tree) #works

plot(ladderize(tree)) #works

#add a tip to an internal branch above the node shared by t2 and t3

x<-match("t2",tree$tip.label)

y<-match("t3",tree$tip.label)

a<-Ancestors(tree,x)

b<-Ancestors(tree,y)

z<-a%in%b

node<-a[min(which(z))]

position<-0.5 * tree$edge.length[which(tree$edge[,2]==node)]

new.tree<-bind.tip(tree,"t6",where=node, position=position)

plot(new.tree) #works

plot(ladderize(new.tree)) #error about non-splitting nodes

plot(ladderize(reorder(new.tree))) #still getting error about non-splitting

plot(ladderize(collapse.singles(new.tree))) #still getting error about non-splitting

#plot(collapse.singles(ladderize(new.tree))) #crashes R

tree$edge

ladderize(tree)$edge #name number of rows

new.tree$edge

ladderize(new.tree)$edge #missing parts?

I've seen comments about phangorn::midpoint messing up ladderize, but it looked like the reorder function would fix it, which doesn't seem to be the case here. Any info would be great Liam, Thanks!

-Travis

Hi Travis.

DeleteI recommend trying:

tree<-untangle(tree,"read.tree")

instead of reorder. It can help fix badly configured trees.

Good luck. Liam

That worked Liam!

DeleteThanks!

This comment has been removed by the author.

ReplyDeleteHi Liam,

ReplyDeleteI am trying to do phylogeography using make.simmap to estimate ancestral states with nsim=100. I am concerned about that I have biased sampling for some of the groups and how can this affect the results. However, I see the object $counts created after running make.simmap with nsim contains a first column (N) with a different number per simulation. I would like to know what is this function really doing during the simulations, is it sampling randomly tips from the tree each iteration independently of their prior group or is it sampling taking into account the number of tips in each group (so, balancing number of tips from each group)?

Many thanks for your help,

Best,

Leo

Hi Liam,

ReplyDeleteI am trying to do a simple pPCA with a molecular phylogeny I made for 13 species. I have morphometric data for 6 morphologies on each specimen (n=2310). I have 2 questions:

Q1) in my X matrix for the phyl.pca() function am I just using the average measurement for each morphometric?

Q2) I keep getting the error: Error in invC%*% X : requires numeric/complex matrix/vector arguments.

When I convert X from a data frame to a matrix it changes to characters from numeric and I'm wondering if that has something to do with this problem?

Please help :S

M

Hi Liam,

ReplyDeleteThanks for a super helpful blog! One question- I've noticed a few papers, when performing an ancestral state reconstruction, saying they only set nodes "with significant support (i.e., a single state that was at least 7.39 times more likely than the next most likely state; Pagel 1999)."

How is this 7.39 value calculated and where does 7.39 come from?

Thanks!

Erik

Hi Liam!

ReplyDeleteThanks so much for this blog, it's helped me immensely.

I need to add a sister taxa to a nexus tree I have in R and have pruned.

I followed your tutorial on adding sister taxa (http://blog.phytools.org/2015/08/adding-tip-to-sister-taxon-on-tree.html) but before I can do it, I need to convert the tree to a phylo object.

When I try to do that, R gives me this message:

Error in UseMethod("as.phylo") : no applicable method for 'as.phylo' applied to an object of class "character"

Any idea why?

Thanks so much!

Hi Liam,

ReplyDeleteThanks very much for a fantastic blog! It has really helped me with my research.

I have a quick question regarding the (potential) double use of phylogenetic correction. We have phenotypic traits (morphological), and functional traits (bite force). We have used a phylogenetic PCA to obtain phylogenetically corrected PC scores of the morphological traits. We would like to regress the first PC axis scores against the bite force. Would it be statistically correct to use the phyl.PC scores and the bite force values in a PGLS? We are concerned that phylogeny would be corrected for twice by doing the stats in this way. Should we use traditional PC scores and bite force values, and then do a PGLS?

Thanks very much,

Shelley

Hi Liam,

ReplyDeleteI have been doing some simple tip additions using the add.species.to.genus function looped across a multiphylo using a list of taxa with where="random". However, every 15 to 20 trees, an error is thrown that a tree is not ultrametric. Inspection of the problem trees shows that occasionally, a tip will be added with a branch length longer than the remaining length of the tree. This of course make the tree no longer ultrametric and causes the error to be thrown when my code tries to add the next tip. I am wondering if you have any thoughts on why this might be occurring, or how I might try to solve it. Is there a minimum branch length that must be added using this function?

Any suggestions much appreciated!

Erik

Hi Erik. Sounds like a bug. Can you send me a fully reproducible example so that I can look into it? Thanks! Liam

DeleteHej Liam

ReplyDeleteThanks for the great resources that you provide. However, I am running into a problem with the midpoint.root() function for my tree. I get the error message > Error in phy$edge[, 2] : incorrect number of dimensions

I was able to narrow done the source of the error to the splitTree function that is called by the reroot function which is called by the midpoint.root function but I didn't get any further.

My Tree is generated by FastTree and made ultra metric with PATHd8 (although the same error occurs before that step).

I paste the output of str(TREE):

> str(TREE)

List of 5

$ edge : int [1:2221, 1:2] 1113 1114 1115 1115 1116 1116 1117 1118 1118 1119 ...

$ Nnode : int 1110

$ tip.label : chr [1:1112] (cut out)

$ edge.length: num [1:2221] 0 0 1 0.00153 0.99848 ...

$ node.label : chr [1:1110] "" "0.000" "0.911" "0.936" ...

- attr(*, "class")= chr "phylo"

- attr(*, "order")= chr "cladewise"

Would you have any idea what could cause the error and what could be done to resolve it?

best

Fabian

Hey again. I dogged a bit deeper and the problem seems to be that splitTree() calls extract.clade() and in extract.clade, anc is found to be node 1117 and the smallest of 7 next.anc is node 1. Therefore *keep* evaluates to *start* + 0:0 and hence just *start* of length == 1. The the edge is updated to phy$edge <- phy$edge[keep, ] which evaluates to an one-dimensional numeric vector of length == 2 wherefore TIPS <- phy$edge[, 2] throws the error above.

ReplyDeleteYet, although I understand what happens. I don't really understand why it happens and what I could do to fix it. Any help would be greatly appreciated!

/Fabian

Hi Fabian, I'm getting the same error message when using splitTree. I'm wondering if you found a solution (or a work-around) to this problem?

DeleteBest,

-Evan

Hi Liam,

ReplyDeleteGreat package. I am trying to add a factor (binary with NAs) to a contMap before the species names, like a coloured square or something similar. Is there any easy way to do this?

Cheers,

Ben

Thank you for a fantastic toolset. In some tree-building programs (like RAxML), samples with identical seq. data are removed from analysis. I'm wondering then if it's possible to label tips with multiple sample names. I'm interested in using your co-phylogeny tools to examine consistency between different trees based on plasmids from the same bacterial sample, however, I want to make sure the same sample names are included in each of the trees, if that makes sense.

ReplyDeleteThank you!

Hello Dr Revell

ReplyDeleteI am really battling with phytools as i am very new to the program. I need to get the mean values at each of the nodes in my tree. I have run ace and succeeded but i am battling to get the means for the nodes.

Please can you help.

Many thanks

Dana Drimml

Dear Liam

ReplyDeleteI also do have (maybe) a silly question. Is it possible and useful to rotate the loadings of the pPCA with varimax? And how would I do this?

Best, Andrea

Hello Liam,

ReplyDeleteI am using phytools to compare rates of speciation between species with different phenotypic traits using the Threshold Model with discrete charecters which are coded for in binary in my matrix. I am getting the problem of "subscript out of bounds" and I am not sure how to fix it. My tree is in nexus format, with 106 tips. The tip labels have been changed to numbers (I guessed from 1 - 106 according to their order in my input file?). My matrix is 3 columns, first represents the species number, the second is if they represent one trait and the second is if they represent the other. This matrix is 106 columns too. A species has to be one of the traits but cannot be both. I wondered if you are able to suggest a reason why the subscript is out of bounds? Many thanks,

Josie

106 rows*** in the matrix

DeleteThis comment has been removed by the author.

ReplyDeleteHello Dr Revell.

ReplyDeleteIt's possible to do a graphic with contMap but specifying the model OU?

Thank you very much.

Diana

Thanks for a great blog! I am trying to estimate discrete ancestral states on a tree and when I use ace with the ARD model:

ReplyDeletefitARD<-ace(region,tree,model="ARD",type="discrete", use.eigen=TRUE)

I get the error: In nlminb(rep(ip, length.out = np), function(p) dev(p), lower = rep(0, :

imaginary parts discarded in coercion.

I also get a rate estimates of 0 for several of the rates that do not seem to be 0. I am wondering how to address this error?

Thank you!

This comment has been removed by a blog administrator.

ReplyDeleteHi Liam,

ReplyDeleteThanks for the great resource! I have a question that isn’t necessarily related to phytools, but phylogenetics in R in general. I’m wondering if you have some insight and potentially a phytools-based solution.

I’ve been having some trouble working in R with unrooted trees (with bootstrap values) inferred with RAxML like this one:

library(ape)

tr<-read.tree(text="((Csativa__XP_010494053:0.00000095723310491617,Csativa__XP_010494052:0.00053878119404546937)100:0.01523094808391222549,((Crub__Carubv10003325m:0.00679395705472142161,Cgrand__Cagra.1690s0002:0.00757370677157680802)100:0.02821026246427783823,((Esal__Thhalv10003818m:0.06818044906713292830,Bstri__Bostr.29827s0022:0.01647969550975519892)71:0.00657376872062076165,(Alyr__16047700:0.02229957218468864300,Athal__AT5G27950:0.03828732716005919801)100:0.02194602478735338535)86:0.00401593635888700638)100:0.01892216227605074755,Csativa__XP_010455157:0.01530003331172456044);")

The problem is that when I root this tree by an outgroup and display (like below), I find that the bootstap scores have been moved around and one of the nodes has a blank value for a bootstrap score.

rooted_tr<-root(tr, "Esal__Thhalv10003818m", resolve.root=TRUE)

plot(rooted_tr)

nodelabels(rooted_tr$node.label)

I believe this has to do with the fact that Ape (and other R packages and tree viewers) treat support values as pertaining to nodes on the tree rather than internal edges. This problem is described in: L. Czech and A. Stamatakis, “Do Phylogenetic Tree Viewers correctly display Support Values?,” bioRxiv, p. 35360, 2015.

I found a work-around that I think will work for me, which is to plot the rooted tree but display the node labels from the unrooted tree (see below).

plot(rooted_tr)

nodelabels(tr$node.label)

I'm just wondering if you have any ideas for avoiding this quirk when rooting trees in R.

Thanks!

-Evan

Hi Lima,

ReplyDeleteThanks for creating such a great package and blog. I'm having a problem removing tip.labels from a countMap plot. I have been using contMap with plotTree.wBars but I've decided I no longer need the bars and would just like the contMap but without labeled tips, which I cant seem to do? im' ploting a 450 tip tree as a "fan". I have made a work around buy still using plotTree.wBars and making x = 0, and border="white", which is fine but I thought there might be a more elegant way?

Best regards

Matt Larcombe

Hi Matt.

DeleteTry setting the option ftype="off".

All the best, Liam

Hi,

ReplyDeleteI have tried to build a SIMMAP null model. So, I have shuffled the tips (discrete states) from original tree and re-ran a SIMMAP (make.simmap function) to store the ages of state transitions . When I run make.simmap using original tree, everything works well. However, when I shuffle the tips, some estimations last forever. Are there some combinantions unfeaseble to reach a stable solution? I guess sometinhg happens with the estimated Q matrix. I would appreciate a lot any advisement.

#Original tree

map=make.simmap(tree,state,nsim=1000,model="ER")

#Null model

map_rand=list()

for(i in 1:1000){

state_rand=sample(state,length(state),replace=F)

map_rand[[i]]=make.simmap(tree,state_rand,nsim=1,model="ER")

}

When the values estimated for Q matrix are very extreme, the function runs forever. For example, I waited 24h with no solution for this Q matrix:

make.simmap is sampling character histories conditioned on the transition matrix

Q =

1 2 3

1 -5512.302 2756.151 2756.151

2 2756.151 -5512.302 2756.151

3 2756.151 2756.151 -5512.302

(estimated using likelihood);

and (mean) root node prior probabilities

pi =

1 2 3

0.3333333 0.3333333 0.3333333

Thanks in advance. Gustavo

Hi Liam (and all),

ReplyDeleteIs there a blog post / function capable of combining the functionality of contmap and dottree? I'm looking to plot a continuous character with a color ramp on the tree with a discrete character dot next to the tips of the tree. If I there is a post on this I have missed it in my search and would appreciate being pointed in the right direction if it is out there.

Thank you,

Gavin

Hi Gavin. I just posted a solution here. Let me know if it is helpful. - Liam

DeleteHi Liam and Matt,

DeleteThank you for your help. The new blog post is extremely helpful and exactly what I was trying to do but couldn't figure out.

Thank you,

Gavin

This comment has been removed by the author.

ReplyDeleteJust removed my previous suggestion based on plotTree.wBars as its more complicated in dotTree, see Liam's excellent post above

ReplyDeleteThis comment has been removed by the author.

ReplyDeleteHi Liam,

ReplyDeleteI'm having issues with using the make.simmap function.

I'm trying to set the prior distribution on the root node using the following arguments:

make.simmap(tree=filostTree, x=data.diet, model="ER", nsim=100, pi=c(0,0,0,1,0,0,0,0))

and I had this Error message:

Error in getPars(bt, xx, model, Q = NULL, tree, tol, m, pi = pi, ...) :

formal argument "pi" matched by multiple actual arguments

Could you help me to fix this problem?

Thanks a lot for your attention!

Hello Liam,

ReplyDeleteI really enjoy the plotBranchbyTrait function. I'm plotting substitution rates and the color scale really makes the long branches pop.

Everything was going great for a few weeks, and now with the same trees I used earlier I'm no longer able to get the scale bar to add. Oddly, I can get it to add with your demo example, so I guess it's something with my trees/data, although the same script worked fine a few weeks back. It plots my tree with lovely colors, but never asks about the scale bar.

My basic pipeline is this:

1) Read in tree with branch lengths

2) Create a branch length variable

3) Plot with plotBranchbyTrait calling the tree and branch length variable, method="edges"

With no error messages or warnings, I'm lost. Any ideas for troubleshooting?

Thank you very much for your time and contribution to the R world!

Bob

Hard to diagnose without the example in hand. If you save your workspace & Rhistory & email them to me, I can try to reproduce the error & get to the bottom of it. Liam

DeleteThis comment has been removed by the author.

ReplyDeleteHi Liam,

ReplyDeleteis there a function in Phytools similar to "plotRateThroughTime" or "getRateThroughTimeMatrix" of the package "BAMMtools"? These methods allow to plot evolutionary rates through time, e.g., http://lukejharmon.github.io/ilhabela/instruction/2015/07/02/diversification-analysis-bamm-rpanda/ (see the "Rate through time plots"). Your rates (or any other continuous data) are associated with the edges of your tree and plotted against time.

Unfortunately, the BAMMTools methods only work with bammdata objects (I'm also not sure, whether they can handle non-ultrametric trees). I was just wondering, whether phytools offered a similar option for more generic data (tree + vector or your data)?

After all, using "plotBranchbyTrait" of the "phytools" package already enables us to plot continuous values (as colours) onto the edges of our phylogeny. The generated object must therefore already contain the association between edges and our continuous data. Is there an easy way to plot our data against time (or any other branch length unit) as in the examples mentioned.

See Cooney et al. (2017, fig. 2c) for another example of these plots:

http://www.nature.com/nature/journal/v542/n7641/full/nature21074.html

Thank you so much!

This comment has been removed by the author.

ReplyDeleteDear Liam, i have a question for you;

ReplyDeleteDo you know any R package that can run ancestral state characters reconstructions for continuous character under linear (Wagner) Parsimony?

Thanks in advance !

Andres. If any package can do this, it would probably be phangorn. I would contact the developer, Klaus Schliep (who works in my lab), if you have any questions.

DeleteLiam, thanks for your quick answer. Unfortunately, if i am not mistaken, phangorn package uses molecular data (dna sequences) instead of continuous morphological data. Or, i´m missing something...

ReplyDeleteLiam, thanks for your quick answer. Unfortunately, if i am not mistaken, phangorn package uses molecular data (dna sequences) instead of continuous morphological data. Or, i´m missing something...

DeleteHi Andres. phangorn can analyze any type of discrete character through the data type "USER", not just molecular data. I'm not sure how Wagner parsimony is defined for continuous traits. It is possible this is equivalent to what was once called 'linear parsimony.' Linear parsimony, if I remember correctly, involved identifying the set of ancestral states with the minimum sum of absolute values of changes. There is a related method called 'squared-change parsimony.' This method involves minimizing the sum of squared changes; however if we take edge lengths of the tree into account we have 'weighted squared change parsimony' - which turns out to give us the same solution as the ML estimates under a Brownian assumption. I'm not aware of any similar justification for linear parsimony, nor am I aware of an implementation of linear parsimony in R - although it would not be hard. - Liam

DeleteThis comment has been removed by the author.

ReplyDeleteDear Liam,

ReplyDeleteI was wondering how I can exclude the past few million years of branches from my time tree. I have a phylogeny that began diversifying around 40MYA and I wanted to see how the gamma statistic value changes when I exclude recent speciation events.

I tried searching for this, but may be I am not searching by the right key word.

Regards,

Aparna

Hi Aparna.

DeleteThis is now in phytools on GitHub. More details here: http://blog.phytools.org/2017/04/update-to-treeslice-to-crop-tipward-end.html. - Liam

Thanks a lot Liam! Will try it out.

ReplyDeleteAparna

Dear Liam,

ReplyDeleteI wanted to map the CI on the timed tree after pruning it. To do this, how can I get the matrix of CI from a posterior distribution of trees (empirical data)?

Thanks,

Aparna

This comment has been removed by the author.

ReplyDeleteDear Liam,

ReplyDeleteI have encountered an issue when I wanted to plot a phylogenetic tree with branched colored by values for a quantitative trait and I'm hoping you can help me out with.

The problem is that the values of trait is assigned randomly to the species so the tree does not represent the actual data. Here is my code:

library(phytools); library(picante)

TREE.R20120829<-"(((((((((((((((((((((((Viola_riviniana,Viola_americana,Viola_pedatifida)Viola,(((((((((((((((((((((Trifolium_repens)trifolium))))))))irlc)))))))))papilionoideae))))fabaceae))fabales)))fabids))rosids),(((((((((((((((((((((Verbena_stricta)Verbena)verbenaceae)))),((Veronica_montana)Veronica)scrophulariaceae))))))lamiales)))lamiids,((((((((((((((((((((((((((((((((((Artemisia_dracunculus,Artemisia_frigida,Artemisia_giraldii,Artemisia_sacrorum,Artemisia_scoparia,Artemisia_tridentata)artemisia))))))))))))))))))))))))))asteraceae)))))asterales))campanulids))ericales_to_asterales)asterids)))))core_eudicots)trochodendrales_to_asterales)sabiales_to_asterales)eudicots)ceratophyllales_and_eudicots,(((((((((((((((((((((((((Zea_mays)zea)))))))pacc,(((((((((((Triticum_aestivum)triticum,(Bromus_briziformis,Bromus_inermis,Bromus_tectorum)bromus))))))))))bep))),(Festuca_arundinacea,Festuca_idahoensis,Festuca_ovina,Festuca_rubra)Festuca,(Agropyron_cristatum,Agropyron_desertorum)Agropyron)poaceae))))))poales)commelinids))))))monocots)poales_to_asterales)magnoliales_to_asterales)austrobaileyales_to_asterales)nymphaeales_to_asterales)angiosperms,((((((Pinus_massoniana,Pinus_sylvestris,Pinus_taeda)Pinus)pinaceae)pinales)))gymnosperms)seedplants)euphyllophyte;"

phy.R20120829<-read.newick(text=TREE.R20120829)

phy.R20120829<-collapse.singles(phy.R20120829)

traits <- read.table("D:\\Gordon\\Review\\R codes and files\\test.csv",header=TRUE,sep=",", quote="", row.names = 1)

traits.precision <- traits[phy.R20120829$tip.label, ]

phy.R20120829 <- compute.brlen(phy.R20120829)

a<-matrix(traits$Precision, dimnames=list(rownames(traits)))

x<-fastBM(phy.R20120829)

b<-(cbind(x,a))[,-1]

dev.new()

plotBranchbyTrait(phy.R20120829,b, mode="edge")

This is the matrix which is correct:

> a

[,1]

Pinus_taeda 0.31310231

Pinus_sylvestris 0.67390947

Pinus_massoniana 0.91629073

Agropyron_desertorum 0.47677369

Agropyron_cristatum 0.50946877

Festuca_rubra 0.57587578

Festuca_ovina 1.06745099

Festuca_idahoensis -0.46081204

Festuca_arundinacea 0.14310084

Bromus_tectorum -0.04652002

Bromus_inermis 0.15016184

Bromus_briziformis -0.05884050

Triticum_aestivum 1.04949307

Zea_mays 0.42532756

Artemisia_tridentata 0.71785864

Artemisia_scoparia 1.33913581

Artemisia_sacrorum 1.44169605

Artemisia_giraldii 1.39847454

Artemisia_frigida 1.50407740

Artemisia_dracunculus 0.27002714

Veronica_montana 0.14841421

Verbena_stricta 2.84354046

Trifolium_repens -0.16251893

Viola_pedatifida 0.40546511

Viola_americana 1.42138568

Viola_riviniana -0.69942773

This is “b” that the values for each species is completely different from above data:

> b

Viola_riviniana

0.31310231

Trifolium_repens

0.47677369

Artemisia_dracunculus

1.06745099

Artemisia_sacrorum

-0.04652002

Zea_mays

1.04949307

Bromus_inermis

1.33913581

Festuca_idahoensis

1.50407740

Agropyron_cristatum

2.84354046

Pinus_sylvestris

1.42138568

I am lost! Any suggestion for troubleshooting?

Thank you so much for your time and contributes.

Mina