Sunday, June 5, 2022

Follow-up on the sensitivity of Blomberg's K to terminal edge lengths of the tree

Yesterday I documented in a post on this blog how a seemingly very small change to the edge lengths of our tree could have a very large effect on our measurement of phylogenetic signal using Blomberg et al.'s K statistic.

Because I used a very small tree (ten tips) in the example, a Twitter follower, Gustavo Paterno, asked the following, very natural question: [Does] the issue become less sensitive with increasing sample size?

Unfortunately, the answer is an emphatic NO. Let's see.

Here I'll simulate a tree with 100 taxa, I'll simulate a trait on that tree under Brownian motion, and then, using code that's similar to what I posted yesterday, I'll shorten all the tips of the tree by slightly less than half the distance between the two most closely related sister taxa.

First, here are my trees:

library(phytools)
t1<-pbtree(n=100)
## compute patristic distance matrix
D<-as.dist(cophenetic(t1))
## find distance to cut
slice<-min(D)/2-1e-12
t2<-treeSlice(t1,max(nodeHeights(t1))-slice,
    orientation="rootwards")
par(mfrow=c(1,2))
plotTree(t1,lwd=1,fsize=0.5)
plotTree(t2,lwd=1,fsize=0.5)

plot of chunk unnamed-chunk-2

Simulate a single character on tree1.

x<-fastBM(t1)
head(x)
##                t18                t19                t75                t76 
##  1.705892864344912  1.866598869750947 -0.197770768667297 -0.692227700756325 
##                t38                 t1 
## -0.198408749443290 -0.108875374497454

Now compute phylogenetic signal using Blomberg et al.'s K on each of the two trees.

phylosig(t1,x,test=TRUE)
## 
## Phylogenetic signal K : 1.08337 
## P-value (based on 1000 randomizations) : 0.001
phylosig(t2,x,test=TRUE)
## 
## Phylogenetic signal K : 0.0000537938 
## P-value (based on 1000 randomizations) : 0.014

As before, we see that λ is pretty much unaffected.

phylosig(t1,x,method="lambda",test=TRUE)
## 
## Phylogenetic signal lambda : 1.0005 
## logL(lambda) : -123.238 
## LR(lambda=0) : 161.014 
## P-value (based on LR test) : 6.79424e-37
phylosig(t2,x,method="lambda",test=TRUE)
## 
## Phylogenetic signal lambda : 0.999827 
## logL(lambda) : -123.238 
## LR(lambda=0) : 161.014 
## P-value (based on LR test) : 6.79424e-37

One trick that could help us understand this apparent contradiction a bit better is to see what happens when we transform our second tree by the MLE of λ for that tree, and then compute K on the transformed tree.

Let's check.

lambda2<-phylosig(t2,x,method="lambda")$lambda
phylosig(phytools:::lambdaTree(t2,lambda2),x)
## 
## Phylogenetic signal K : 1.06701

What? Ha.

No comments:

Post a Comment

Note: due to the very large amount of spam, all comments are now automatically submitted for moderation.