Yesterday I documented in a post on this blog how a seemingly very small change to the edge lengths of our tree could have a very large effect on our measurement of phylogenetic signal using Blomberg et al.'s K statistic.
Because I used a very small tree (ten tips) in the example, a Twitter follower, Gustavo Paterno, asked the following, very natural question: [Does] the issue become less sensitive with increasing sample size?
Thanks. Would never guess this extreme behaviour. The issue becomes less sensitive with increasing sample size?
— Gustavo Paterno (@paternogb) June 5, 2022
Unfortunately, the answer is an emphatic NO. Let's see.
Here I'll simulate a tree with 100 taxa, I'll simulate a trait on that tree under Brownian motion, and then, using code that's similar to what I posted yesterday, I'll shorten all the tips of the tree by slightly less than half the distance between the two most closely related sister taxa.
First, here are my trees:
library(phytools)
t1<-pbtree(n=100)
## compute patristic distance matrix
D<-as.dist(cophenetic(t1))
## find distance to cut
slice<-min(D)/2-1e-12
t2<-treeSlice(t1,max(nodeHeights(t1))-slice,
orientation="rootwards")
par(mfrow=c(1,2))
plotTree(t1,lwd=1,fsize=0.5)
plotTree(t2,lwd=1,fsize=0.5)
Simulate a single character on tree1
.
x<-fastBM(t1)
head(x)
## t18 t19 t75 t76
## 1.705892864344912 1.866598869750947 -0.197770768667297 -0.692227700756325
## t38 t1
## -0.198408749443290 -0.108875374497454
Now compute phylogenetic signal using Blomberg et al.'s K on each of the two trees.
phylosig(t1,x,test=TRUE)
##
## Phylogenetic signal K : 1.08337
## P-value (based on 1000 randomizations) : 0.001
phylosig(t2,x,test=TRUE)
##
## Phylogenetic signal K : 0.0000537938
## P-value (based on 1000 randomizations) : 0.014
As before, we see that λ is pretty much unaffected.
phylosig(t1,x,method="lambda",test=TRUE)
##
## Phylogenetic signal lambda : 1.0005
## logL(lambda) : -123.238
## LR(lambda=0) : 161.014
## P-value (based on LR test) : 6.79424e-37
phylosig(t2,x,method="lambda",test=TRUE)
##
## Phylogenetic signal lambda : 0.999827
## logL(lambda) : -123.238
## LR(lambda=0) : 161.014
## P-value (based on LR test) : 6.79424e-37
One trick that could help us understand this apparent contradiction a bit better is to see what happens when we transform our second tree by the MLE of λ for that tree, and then compute K on the transformed tree.
Let's check.
lambda2<-phylosig(t2,x,method="lambda")$lambda
phylosig(phytools:::lambdaTree(t2,lambda2),x)
##
## Phylogenetic signal K : 1.06701
What? Ha.
Hello,
ReplyDeleteI am finding the same issue in my analyses (1414 species, tree made with V.PhyloMaker2 i.e. pruned from megatree built based on molecular data), ex. lambda = 0.819191 and K = 0.0520079 for the same trait and tree, with both having p <<< 0.05. However, once I transform the tree by its MLE of lambda, K = 0.699543, similar to the result at the end.
I presume K from the adjusted tree be a more valid representation of the true value, but I have trouble understanding what running a phylogenetic signal analysis (K) on a phylogenetic signal - adjusted tree (MLE of lambda) really means.
Hi,
ReplyDeleteI am finding the same thing in my data (1414 species, phylogeny from V.PhyloMaker2 - pruned from megatree built based on molecular data).
on the untransformed yet ultrametric tree, lambda = 0.82 and K = 0.05 for the same trait, both with p <<0.05.
When I adjust the tree by the MLE of lambda as shown above, K is now 0.70.
I presume the adjusted K to be a more accurate representation of the true K value, however I am having trouble understanding what this represents. What does running a phylogenetic signal analysis (K) on a phylogenetic signal - adjusted (lambda) tree mean?