*phylogenetic variance-covariance matrix*), but that included ancestral nodes. I pulled this out of an old function of mine, anc.trend(), so I figured I could do better - and after some tinkering I came up with a much improved version, as follows:

vcvPhylo2<-function(tree,anc.nodes=T){

n<-length(tree$tip.label)

h<-nodeHeights(tree)[order(tree$edge[,2]),2]

h<-c(h[1:n],0,h[(n+1):length(h)])

M<-mrca(tree,full=anc.nodes)[c(1:n,anc.nodes*(n+2:tree$Nnode)),c(1:n,anc.nodes*(n+2:tree$Nnode))]

C<-matrix(h[M],nrow(M),ncol(M))

if(anc.nodes) rownames(C)<-colnames(C)<-c(tree$tip.label,n+2:tree$Nnode)

else rownames(C)<-colnames(C)<-tree$tip.label

return(C)

}

A couple of tricks in here:

1) Rather than calling dist.nodes(), this function uses the phytools function nodeHeights, which returns a matrix of the dimensions of tree$edge, containing the height above the root of every node, and is quite fast.

2) The function then uses mrca(...,full=T) in the ape package, which returns the MRCA of every pair of species and nodes. This essentially serves as an index to construct our VCV matrix from the node heights!

All of this is done with the following lines of code:

# this gets a vector of height by node number

h<-nodeHeights(tree)[order(tree$edge[,2]),2]

# this inserts a height of zero for the root node (node n+1 for n species)

h<-c(h[1:n],0,h[(n+1):length(h)])

# this is our index matrix

M<-mrca(tree,full=T)

# this is our VCV matrix

C<-matrix(h[M],nrow(M),ncol(M))

Let's compare computation time to vcv.phylo and our earlier version (vcvPhylo1):

> tree<-rtree(500)

> system.time(X<-vcv.phylo(tree)) # only tips

user system elapsed

2.20 0.00 2.22

> system.time(Y<-vcvPhylo1(tree)) # tips & nodes, old version

user system elapsed

7.68 0.03 7.73

> system.time(Z<-vcvPhylo2(tree)) # tips & nodes, new version

user system elapsed

3.49 0.05 3.54

> all(Y==Z)

[1] TRUE

> all(X==Z[c(1:500),c(1:500)])

[1] TRUE

Cool! BTW - Happy Thanksgiving to all the blog readers.

## No comments:

## Post a Comment