Wednesday, December 7, 2011

Dropping a random tip (or set of tips) from a tree

In a couple of recent posts (1,2) I have commented on Google search strings that (as the administrator of this blog) I can see have led users to my site. Today, a user ended up visiting the blog due to the following search string:

"randomly drop.tip in r"

which I take to mean that they'd like to drop a random tip (or set of tips) from a "phylo" object in R.

Here's how to do it. To drop a single tip at random just type:

tr2<-drop.tip(tr1,sample(tr1$tip.label,1))

To drop a set of tips, say m=10 random tips, just type:

tr2<-drop.tip(tr1,sample(tr1$tip.label,m))

(Actually, this also works for a single tip; you just need to set m=1.) Of course, if you wanted to drop, say, 30% of the taxa at random, you would first compute the number of taxa to drop, m, and then apply the code above. For instance:

m<-round(0.3*length(tr1$tip.label))
tr2<-drop.tip(tr1,sample(tr1$tip.label,m))
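
To check that this behaves as expected, here is a minimal, self-contained sketch. It assumes the ape package is installed; the simulated 50-tip tree from rtree() and the seed are stand-ins chosen only for illustration, so substitute your own tree for tr1.

```r
library(ape)  # provides rtree(), drop.tip(), and Ntip()

set.seed(1)           # assumed seed, only so the example is repeatable
tr1 <- rtree(n = 50)  # hypothetical random 50-tip tree standing in for your own

# number of tips to drop: 30% of the total, rounded to a whole number
m <- round(0.3 * Ntip(tr1))

# sample m tip labels without replacement and prune them from the tree
tr2 <- drop.tip(tr1, sample(tr1$tip.label, m))

Ntip(tr1)  # number of tips before pruning
Ntip(tr2)  # number of tips after pruning
```

Passing m as the size argument of sample() also behaves sensibly when m rounds to 0, in which case no labels are selected.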


It's that easy!

6 comments:

  1. Hi Liam, Do you have a code for how to drop a number of tips with a certain tip.state randomly from a tree? (so not all tips with this state, but just a certain number)? Would be great, thanks, Renske

    1. Hi Renske.

      Do you want to drop a fixed number, or any tip in state A with a certain probability? (These will give slightly different results.) Also, how are your tip states stored?

      Say you have a named character vector, x, whose names are the tip labels and whose values are the tip states "A", "B", and "C", and you want to drop 10 tips at random that are in state "B". You could do:

      tips<-sample(names(x)[x=="B"],size=10)
      tt<-drop.tip(tree,tips)

      By contrast, to drop tips in state "B" with probability 0.1 (for instance), you would do:

      tips<-names(x)[x=="B"]
      tips<-tips[runif(n=length(tips))<0.1]
      tt<-drop.tip(tree,tips)
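
      The two approaches can be compared side by side in a small sketch. It assumes the ape package is installed; the simulated tree, the randomly assigned states in x, and the seed are all hypothetical stand-ins for real data. The min() guard is an addition of this sketch, so that sample() does not fail when fewer than 10 tips are in state "B".

      ```r
      library(ape)  # provides rtree(), drop.tip(), and Ntip()

      set.seed(2)            # assumed seed, for repeatability only
      tree <- rtree(n = 60)  # hypothetical random 60-tip tree

      # hypothetical tip states: a named character vector, names = tip labels
      x <- setNames(sample(c("A", "B", "C"), 60, replace = TRUE), tree$tip.label)

      inB <- names(x)[x == "B"]  # labels of all tips in state "B"

      # fixed number: drop 10 tips in state "B" (or all of them if fewer than 10)
      fixed <- sample(inB, size = min(10, length(inB)))
      tt1 <- drop.tip(tree, fixed)

      # fixed probability: drop each tip in state "B" independently with p = 0.1
      random <- inB[runif(n = length(inB)) < 0.1]
      tt2 <- if (length(random) > 0) drop.tip(tree, random) else tree
      ```

      With the first approach the number of dropped tips is fixed in advance; with the second it varies from run to run and can be zero.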

      I can be more specific if your data is configured differently.

      Good luck. Liam

    2. Cool thanks! I edited it a bit, the following works as well:

      sampling.f0<-b0 #put here the number of species in state AB you want to drop
      sampling.f1<-b1 #put here the number of species in state A you want to drop
      sampling.f2<-b2 #put here the number of species in state B you want to drop

      drop0<-sample(phy$tip.label[phy$tip.state=="0"],sampling.f0)
      drop1<-sample(phy$tip.label[phy$tip.state=="1"],sampling.f1)
      drop2<-sample(phy$tip.label[phy$tip.state=="2"],sampling.f2)
      drop<-c(drop0,drop1,drop2)
      tree <- drop.tip(phy, drop)

  2. Hi Liam,
    Does this work for multiphylo as well?
    m<-round(0.3*length(tr1$tip))
    tr2<-drop.tip(tr1,sample(tr1$tip.label)[1:m])

