Thursday, January 31, 2013

Note on polytomies and internal branches of zero length in ancestral character estimation

Just a quick note on the use of polytomous trees vs. arbitrarily fully-resolved trees with branches of zero length in phytools functions for ancestral character estimation such as anc.Bayes, anc.ML, anc.trend, and ancThresh (as well as other functions that call these internally). Basically, and confusingly unlike functions such as pic in ape, polytomous trees (not trees with internal branches of zero length) should be used. This is because of one specific internal calculation in these functions: the inversion of the variance-covariance matrix among tip species and internal nodes, which is computed by the phytools function vcvPhylo. If polytomies are represented using internal branches of zero length then vcvPhylo(tree) returns a singular matrix, which cannot be inverted. By contrast, if polytomies are represented as polytomies, the matrix returned by vcvPhylo(tree) is non-singular.

Here's an example:
> # create a tree with a polytomy
> tree<-read.tree(text="(A:2,(B:1,C:1,D:1):1);")
> plotTree(tree,node.numbers=TRUE)
> # compute among species & node VCV matrix
> C<-vcvPhylo(tree)
> C
  A B C D 6
A 2 0 0 0 0
B 0 2 1 1 1
C 0 1 2 1 1
D 0 1 1 2 1
6 0 1 1 1 1
> # invert?
> invC<-solve(C)
> invC # no problem
   A  B             C             D  6
A 0.5  0  0.000000e+00  0.000000e+00  0
B 0.0  1  0.000000e+00 -7.401487e-17 -1
C 0.0  0  1.000000e+00 -7.401487e-17 -1
D 0.0  0  4.163336e-17  1.000000e+00 -1
6 0.0 -1 -1.000000e+00 -1.000000e+00  4
> # resolve all nodes
> tree<-multi2di(tree)
> plotTree(tree,node.numbers=TRUE)
> # compute among species & node VCV matrix
> C<-vcvPhylo(tree)
> C
  A B C D 6 7
A 2 0 0 0 0 0
B 0 2 1 1 1 1
C 0 1 2 1 1 1
D 0 1 1 2 1 1
6 0 1 1 1 1 1
7 0 1 1 1 1 1
> # invert?
> invC<-solve(C)
Error in solve.default(C) :
Lapack routine dgesv: system is exactly singular: U[6,6]=0
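
The reason for the singularity can be checked directly. The two nodes at either end of a zero-length internal branch sit at the same height above the root and share exactly the same history with every other tip and node, so their rows (and columns) in the matrix are identical. Here is a minimal sketch of that check, assuming ape and phytools are installed and that the internal nodes occupy the last rows of the matrix, as in the output above:

```r
library(ape)
library(phytools)

# same tree as in the example above
tree <- read.tree(text = "(A:2,(B:1,C:1,D:1):1);")
C_poly <- vcvPhylo(tree)            # polytomy kept as a polytomy
C_res  <- vcvPhylo(multi2di(tree))  # polytomy resolved with a zero-length branch

# the polytomous version has full rank, so solve() works
full_rank_poly <- qr(C_poly)$rank == nrow(C_poly)

# the resolved version is rank deficient ...
deficient_res <- qr(C_res)$rank < nrow(C_res)

# ... because the rows for the two internal nodes (the last two rows) are identical
n <- nrow(C_res)
same_rows <- isTRUE(all.equal(unname(C_res[n - 1, ]), unname(C_res[n, ])))

c(full_rank_poly = full_rank_poly,
  deficient_res  = deficient_res,
  same_rows      = same_rows)
```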


It would seem to make sense to just include an internal step in all of these functions in which we use di2multi to collapse any branches of zero length. The problem with this idea is that the consequence would be a disassociation between the node numbers of the original tree and the node numbers for which ancestral character estimates are returned. Yikes! In future versions of these functions I will try to remember to have them return a meaningful error if a tree with internal branches of zero length is input, rather than just failing as they do at present.
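
In the meantime, a check along these lines can be run before calling any of the functions. This is only a sketch: check_zero_internal is a hypothetical helper written for this note, not a phytools function, and the tolerance is an arbitrary choice (it matches the default tol of di2multi in ape). The last lines also illustrate the node-number problem, since collapsing changes the number of internal nodes:

```r
library(ape)

# Hypothetical helper (not part of phytools or ape): stop with a clear message
# if the tree contains internal branches of (near) zero length.
check_zero_internal <- function(tree, tol = 1e-8) {
  # an edge is internal if its descendant node number exceeds the number of tips
  internal <- tree$edge[, 2] > Ntip(tree)
  zero <- internal & tree$edge.length < tol
  if (any(zero))
    stop(sum(zero), " internal branch(es) of zero length found; ",
         "collapse them to polytomies with di2multi() first")
  invisible(TRUE)
}

poly <- read.tree(text = "(A:2,(B:1,C:1,D:1):1);")
resolved <- multi2di(poly)

check_zero_internal(poly)  # passes silently
res <- try(check_zero_internal(resolved), silent = TRUE)
inherits(res, "try-error") # the resolved tree is rejected

# collapsing the zero-length branch changes the number of internal nodes,
# so node numbers no longer correspond to those of the input tree
collapsed <- di2multi(resolved)
c(resolved = Nnode(resolved), collapsed = Nnode(collapsed))
```

If the check fails, collapsing the tree yourself with di2multi before the analysis means the node numbers in the output match the tree you actually passed in.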

7 comments:

  1. Yeah, zero length branches are a headache I have to deal with a lot. In paleo, the way we time-scale cladograms of fossil taxa often produces terminal branches of zero length.

    1. Hi David. Can you elaborate? Why is that the case?

      Zero length terminal branches are also relatively common in molecular phylogenetic studies. Branches of zero length, of course, cannot have experienced evolution, so they can cause significant problems for phenotypic studies. However, it is usually the case that terminal zero length branches in empirical studies are not really zero in length - they are just incorrect (underestimated). The correct way to deal with these branches is unclear.

      - Liam

  2. This comment has been removed by the author.

    1. Hi Liam, can you tell me how to resolve polytomies e.g., using 'multi2di' and at the same time add minimal branch lengths to the resolved internal branches?
      Thanks for your help!

    2. By coincidence, I just did this here. Keep in mind that after doing this your tree may not be (strictly) ultrametric. Let me know if this is what you were thinking of.

    3. Thank you - this is very useful.

  3. I wonder if there is a way to identify polytomies in terminal branches and delete these branches (prune the tree).

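
On the question in the comments about resolving polytomies with multi2di and giving the new internal branches a minimal length: one simple way this might be done is sketched below. It is not necessarily the approach taken in the post linked in the reply, and the value of eps is an arbitrary assumption that should be small relative to the other branch lengths in the tree.

```r
library(ape)

tree <- read.tree(text = "(A:2,(B:1,C:1,D:1):1);")

# resolve polytomies; the newly created internal branches have length zero
resolved <- multi2di(tree)

# replace zero-length branches with a small positive length
# (eps is an arbitrary choice made for this illustration)
eps <- 1e-6
zero <- resolved$edge.length == 0
resolved$edge.length[zero] <- eps

is.binary(resolved)            # the tree is now fully bifurcating
all(resolved$edge.length > 0)  # and has no branches of zero length
is.ultrametric(resolved)       # may be FALSE: tips below a lengthened branch sit slightly higher
```

Note that this replaces every zero-length branch, terminal ones included, so subset zero with resolved$edge[, 2] > Ntip(resolved) if only internal branches should be changed.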