Friday, June 29, 2012

R trick 1: get the frequencies of factor levels in a vector

Here's a quick R hint. (I had briefly forgotten how to do this, and the solution wasn't totally obvious online.) Say I have a factor in memory in R and I want the frequency or relative frequency of each of its levels. I can get this using the base generic function summary. To see how this works, consider a factor containing the best-fitting quantitative trait evolution model for each of a set of 100 trees:

> best.fit
 [1] BM     BM     OU     lambda BM     BM     OU     BM
 [9] BM     BM     BM     lambda lambda BM     OU     BM
[17] BM     BM     OU     BM     BM     BM     lambda BM
[25] BM     BM     BM     OU     lambda BM     BM     BM
[33] BM     OU     BM     BM     lambda lambda lambda BM
[41] BM     BM     BM     OU     BM     BM     BM     BM
[49] BM     OU     BM     BM     BM     BM     BM     BM
[57] lambda lambda OU     OU     BM     BM     lambda BM
[65] BM     BM     BM     BM     lambda BM     BM     BM
[73] BM     OU     OU     BM     lambda BM     lambda BM
[81] BM     lambda BM     BM     BM     OU     BM     BM
[89] BM     BM     OU     OU     lambda BM     BM     BM
[97] BM     BM     OU     BM
Levels: BM lambda OU


We can get the number or relative frequency of trees with each best-fitting model as follows:

> summary(best.fit)
   BM lambda     OU
   68     16     16
> summary(best.fit)/sum(summary(best.fit))
   BM lambda     OU
 0.68   0.16   0.16
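
For comparison, here is a minimal sketch of the same calculation using the base functions table and prop.table instead of summary. The best.fit object below is a hypothetical stand-in built with rep so that the example runs on its own; it only reproduces the level counts shown above, not the original order of the vector.

```r
# Hypothetical stand-in for best.fit: same level counts as above,
# but not the original ordering of the 100 trees
best.fit <- factor(rep(c("BM", "lambda", "OU"), times = c(68, 16, 16)))

# Absolute frequency of each level
counts <- table(best.fit)

# Relative frequency of each level (the proportions sum to 1)
props <- prop.table(counts)

print(counts)
print(props)
```

One possible reason to prefer table is that it also counts the distinct values of a character vector, whereas summary only returns per-level counts when the object is a factor.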


That's it.
