Wednesday, December 28, 2011

New version of phylosig, and new functions to match data & tree

In response to a user bug report, I have now just posted a new version of the function phylosig() online. Direct link to code is here, and this will be added to the next version of phytools.

The user reported two issues:

1) If the names attribute was not assigned for the vector of standard errors, se, the method returned an error. This was not true for the data vector, x, in which phylosig just assumed the same order as tree$tip.label for is.null(names(x))==TRUE.

2) The user also reported that the function failed for trees containing polytomies, but only in the case in which standard errors for the the trait means are provided.

First, 2). I realized quickly that this was because the function had been using pic() to get a starting value for the evolutionary rate, σ2 during optimization. This was easy enough to solve just by changing to pic(multi2di(tree),...).

Second, 1). I was using different error control for the standard errors and the mean vector. What I have done to resolve this problem is created two new functions that match up the names of an arbitrary vector and the tip labels of a tree. I also replaced some of my earlier code with set functions setdiff and intersect.

The function to match a data vector to the tree is here:

# written by Liam J. Revell 2011
matchDatatoTree<-function(tree,x,name){
   if(is.matrix(x)) x<-x[,1]
   if(is.null(names(x))){
      if(length(x)==length(tree$tip)){
         print(paste(name,"has no names; assuming",name,
         "is in the same order as tree$tip.label"))
         names(x)<-tree$tip.label
      } else
         stop(paste(name,
         "has no names and is a different length than tree$tip.label"))
   }
   if(any(is.na(match(names(x),tree$tip.label)))){
      print(paste("some species in",name,
      "are missing from tree, dropping missing taxa from",name))
      x<-x[intersect(tree$tip.label,names(x))]
   }
   return(x)
}


Whereas the function to match the tree to a data vector is here:

# written by Liam J. Revell 2011
matchTreetoData<-function(tree,x,name){
   if(any(is.na(match(tree$tip.label,names(x))))){
      print(paste("some species in tree are missing from",name,
      ", dropping missing taxa from the tree"))
      tree<-drop.tip(tree,setdiff(tree$tip.label,names(x)))
   }
   if(any(is.na(x))){
      print(paste("some data in",name,
      "given as 'NA', dropping corresponding species from tree"))
      tree<-drop.tip(tree,names(which(is.na(x))))
   }
   return(tree)
}


Cool.

No comments:

Post a Comment

Note: due to the very large amount of spam, all comments are now automatically submitted for moderation.