Fastest way to count occurrences of each unique element

What is the fastest way to compute the number of occurrences for each unique element in a vector in R?

So far, I've tried the following five functions:

2 Answers

醉梦人生

2020-12-05 20:02
There's almost nothing that will beat tabulate(), provided your input meets its requirements: it counts positive integers (or the integer codes of a factor) and silently ignores anything else.
```
x <- sample(1:100, size=1e7, TRUE)
system.time(tabulate(x))
#  user  system elapsed 
# 0.071   0.000   0.072 
```
@dickoa adds a few more notes in the comments as to how to get the appropriate output, but tabulate as a workhorse function is the way to go.
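For illustration, here is a minimal sketch of turning the raw tabulate() counts into a value/count table. The helper name `count_with_tabulate` is hypothetical and this is not necessarily the conversion @dickoa suggested; it assumes `x` contains only positive integers.

```r
# Hypothetical helper (not from the original answer): wrap tabulate() so the
# result pairs each observed value with its count.
# Assumes x is a vector of positive integers.
count_with_tabulate <- function(x) {
  counts <- tabulate(x)          # counts[i] is the number of times i occurs
  present <- which(counts > 0L)  # keep only values that actually occur
  data.frame(value = present, count = counts[present])
}

x <- c(3L, 1L, 3L, 7L, 1L, 3L)
count_with_tabulate(x)
#   value count
# 1     1     2
# 2     3     3
# 3     7     1
```

Note that tabulate() allocates one bin per integer up to max(x), so this approach suits dense, small-range integers better than sparse, very large ones.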
我寻月下人不归

2020-12-05 20:11
This is a little slower than tabulate, but is more universal (it will work with characters, factors, basically whatever you throw at it) and much easier to read/maintain/expand.
```
library(data.table)

f6 = function(x) {
  data.table(x)[, .N, keyby = x]
}

x <- sample(1:1000, size=1e7, TRUE)
system.time(f6(x))
#   user  system elapsed 
#   0.80    0.07    0.86 

system.time(f8(x)) # tabulate + dickoa's conversion to data.frame
#   user  system elapsed 
#   0.56    0.04    0.60 
```
UPDATE: As of data.table 1.9.3, the data.table approach is actually about 2x faster than tabulate plus the data.frame conversion.
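To illustrate the "more universal" point, here is a small sketch of the same grouping idiom applied to a character vector, which tabulate() cannot count directly. The function name `count_with_dt` is hypothetical; the body is the same `.N` with `keyby` expression used in f6 above, and the example assumes the data.table package is installed.

```r
library(data.table)

# Hypothetical name; same idiom as f6: count rows per group with .N and
# return the groups sorted via keyby.
count_with_dt <- function(x) {
  data.table(x)[, .N, keyby = x]
}

fruit <- c("b", "a", "b", "c", "a", "b")
res <- count_with_dt(fruit)
res$x  # unique values, sorted: "a" "b" "c"
res$N  # matching counts:       2   3   1
```

The result is a keyed data.table, so it can be joined or filtered by `x` without further conversion.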