Saturday, December 3, 2011

Dropping the same tip from many trees in a "multiPhylo" object

Following the same theme as an earlier post, I thought I would take a closer look a Google search strings that have led readers to my site. A recent search string that has multiple hits is the following: "drop.tip multiphylo", by which I presume they mean to ask how one drops the same tip (or set of tips) from a set of phylogenies in an object of class "multiPhylo". It turns out that we can do this very easily, in fact.

First, let's get a set of trees as an object of the class "multiPhylo". [In most cases, of course, these would probably be trees we have read in from file (say, as a posterior sample of trees in a Bayesian analysis). In this case, we will just use the ape function rmtree().]

> require(ape)
> trees<-rmtree(N=10,n=20)
> trees
10 phylogenetic trees
> require(phytools)
> plotTree(trees[[6]]) # for instance




This gives a set of 10 trees, each containing 20 taxa, with the tip labels "t1" to "t20".

Now, a "multiPhylo" object is really just a list of trees, so to apply the ape function drop.tip to all the trees in the list, we can just use the function lapply. Let's drop, say "t13" (for luck) from all the trees in the list:

> ntrees<-lapply(trees,drop.tip,tip="t13")
> class(trees)<-"multiPhylo"
> plotTree(ntrees[[6]]) # for comparison




and taxon "t13" is gone, just as we'd hope. (The same is true, of course, of all the trees in our set.)

Naturally, the same technique works for a set of tips, for instance:

> ntrees<-lapply(trees,drop.tip,tip=c("t11","t12","t13"))
> class(ntrees)<-"multiPhylo"


We can even combine this with the trick, described in an earlier post that allows us to drop all but the set of taxa we have in a list. In this case, we will make the special function keep.tip. For instance if we wanted to keep only "t1" through "t10" in each of our trees above, we could just do the following:

> species<-paste("t",1:10,sep="")
> species
[1] "t1" "t2" "t3" "t4" "t5" "t6" "t7" "t8" "t9" "t10"
> keep.tip<-function(tree,tip) drop.tip(tree,setdiff(tree$tip.label,tip))
> ntrees<-lapply(trees,keep.tip,tip=species)
> "multiPhylo"->class(ntrees)
> plotTree(ntrees[[6]])




[Thanks to Dan Rabosky for the setdiff() suggestion.] Cool.

8 comments:

  1. Dear Liam,

    First, thanks for your very helpful blog. Here is my problem :

    After dropping tips from 200 Beast trees in nexus format, I wanted to rename the tips but ntrees$tip.label returns an empty object. I realized that the new multiPhylo object with pruned trees had lost its original structure. I don't know if this is the usual output of drop.tip with the lapply function or if I have done something wrong.

    Is it possible to keep the nexus format of my original multiPhylo object when dropping tips or do I have to re-write a new nexus file?

    Best regards,

    Oriane

    ReplyDelete
  2. Dear Liam,

    thank you for that very useful tip. However, I was wondering which command(s) I should use if I want to insert a tip at the same position into a set of trees? I am working on the cophylogenetics of a plant genus and its insect pollinator and I would like to insert for each insect pollinator species as many tips into the pollinator tree as many host species it visits. Since I also want to account for phylogenetic incertainity I would like to do that in a set of trees sampled from the posterior distribution. I would highly appreciate your input! Thank you very much!

    Best wishes,
    Belinda

    ReplyDelete
  3. Hi Liam,

    I've been following some of your advices to work with an object of the class Multiphylo. I read it initially and I could create the object which contain 1000 trees, nevertheless I've been trying to read again the same file and now I'm getting this error: Error in if (all(phy$node.label == "")) phy$node.label <- NULL :
    missing value where TRUE/FALSE needed

    I'm not sure if the R version is the problem, I'm using the last version 3.6.0. What confuses me is that I could read the file one day, and the next day without changing the script that error was the only thing I got. I would appreciate any insights about what I'm doing wrong.
    Thank you!

    Nataly

    ReplyDelete
    Replies
    1. Dear Nataly. I'm not sure what the problem is. One possibility would be to try the following code:

      trees<-phytools::untangle(trees,"read.tree")

      & then run it again. Let me know if that solves the issue.

      All the best, Liam

      Delete
    2. Hi Liam,

      Is this solution to apply after I have created the MultiPhylo object? 'cause the main issue is that I cannot create the object in R when I use: supertree<-read.tree("Supertree.tre")
      what I got is: Error in if (all(phy$node.label == "")) phy$node.label <- NULL :
      missing value where TRUE/FALSE needed

      I have checked and downloaded the file several times, re-started Rstudio but I still got the same Error.
      I tried your script but it asked for the multiPhylo object, is that right?
      Cheers,

      Nataly

      Delete
  4. Dear Liam,

    I'm trying to read a posterior sample of Bayesian trees (n = 1000) to use them later with the function ltt95. However, I cannot even plot them, getting following error: Error: $ operator not defined for this S4 class.

    Here is my code:

    > trees <- read.beast("final_wo3rd_bmodel_burnin.trees")
    > trees
    1000 beast trees
    > class(trees)<-"multiPhylo"
    > trees
    1000 phylogenetic trees
    > plotTree(trees[[6]])
    Error: $ operator not defined for this S4 class


    Cheers Josef

    ReplyDelete
    Replies
    1. Hi Josef. Not sure. I haven't used read.beast before. Have you tried to just read in the trees using read.nexus? You'll lose the BEAST annotation, but this would tell you if the problem is with phytools or with ips. - Liam

      Delete
    2. Hi Liam,

      you made my day. 'read.nexus' works perfectly. Many thanks.

      Cheers Josef

      Delete

Note: due to the very large amount of spam, all comments are now automatically submitted for moderation.