An R-sig-phylo user asks:
Is there a way to detect and list all tips of a tree with 0 length in their terminal branches (essentially duplicated sequences)? Would also be great if it's possible to output them in groups so that it's clear which tips are identical to which.
I answered the first part as follows:
I would advise setting some "tolerance" value, below which you consider a terminal edge to be zero. Then try:
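As a minimal sketch of that idea (not necessarily the exact code from the reply), the following assumes the ape package and a made-up four-tip tree. The tree, the value of `tol`, and the names `is_terminal` and `zero_tips` are illustrative assumptions.

```r
library(ape)

# Hypothetical example tree: t1 and t2 sit on zero-length terminal edges
tree <- read.tree(text = "((t1:0,t2:0):1,(t3:0.5,t4:0.7):0.5);")

tol <- 1e-8 # tolerance; an arbitrary illustrative value

# Rows of tree$edge whose child node is a tip (tips are numbered 1..Ntip)
is_terminal <- tree$edge[, 2] <= Ntip(tree)

# Labels of tips whose terminal edge is no longer than tol
zero_tips <- tree$tip.label[tree$edge[is_terminal & tree$edge.length <= tol, 2]]
zero_tips
```

The second column of `tree$edge` holds the descendant node of each edge, so restricting to values no greater than `Ntip(tree)` keeps only terminal edges.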
This works great for the first part. The only problem is that it ignores the second part: it doesn't tell us which tips are separated from which other tips by a distance of tol or less.
Here is a solution that addresses that:
tol<-1e-12 # say
Let's take the following tree:
Loading required package: ape
> tol<-1e-12 # say
> x<-apply(D,1,function(x) names(which(x<=tol)))
 "t9" "t11"
 "t10" "t12" "t13"
 "t3" "t14"
 "t6" "t17"
 "t1" "t18" "t19"
Cool - it works.
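To try the grouping step end to end, here is a minimal sketch on a small made-up tree. It assumes that `D` is the matrix of patristic distances from `cophenetic(tree)`, since the definition of `D` is not shown above. It also uses `lapply` rather than `apply`, because `apply` simplifies its result to a matrix whenever every tip happens to have the same number of matches. The tree and the name `groups` are illustrative.

```r
library(ape)

# Hypothetical example tree with two sets of identical tips
tree <- read.tree(text = "(((t1:0,t2:0):0,t3:0):1,((t4:0,t5:0):0.5,t6:0.7):0.5);")

tol <- 1e-12 # say

# Assumed: D holds the patristic distances between all pairs of tips
D <- cophenetic(tree)

# For each tip, the names of all tips (itself included) within tol of it
x <- lapply(rownames(D), function(tip) names(which(D[tip, ] <= tol)))

# Drop tips that match only themselves, then collapse repeated groups
groups <- unique(x[lengths(x) > 1])
groups
```

With a tolerance this small, "within tol" behaves like exact identity, so every member of a group produces the same vector and `unique` collapses them. With a larger tolerance the relation need not be transitive, and overlapping groups could appear.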