How do I sum the values of columns in several tables if the tables have different lengths?

离开以前

Alright, this should be an easy one, but I'm looking for a solution that's as fast as possible.

Let's say I have 3 tables (the number of tables will be much larger

3 Answers

天涯浪人

2021-01-18 00:17

You can try this:

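```
# Stack the tables as one-column matrices; the row names carry the table labels
df <- rbind(as.matrix(tab1), as.matrix(tab2), as.matrix(tab3))
# Sum the values that share a row name
aggregate(df, by=list(row.names(df)), FUN=sum)
#   Group.1 V1
# 1       1  7
# 2       2  3
# 3       3  4
# 4       4  3
# 5       5  1
```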

离开以前

2021-01-18 00:33
We concatenate (c) the tables to create 'v1', then use tapply to sum the elements grouped by the names of that object.
```
v1 <- c(tab1, tab2, tab3)
tapply(v1, names(v1), FUN=sum)
#1 2 3 4 5 
#7 3 4 3 1 
```
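Since the question mentions that the real number of tables is much larger than three, the same idea should carry over to a list of tables. A minimal sketch, assuming the tables are one-dimensional table() results collected in a list; the input values and the name `tabs` are made up for illustration and are not the asker's data:

```r
# Hypothetical input: three frequency tables of different lengths
tabs <- list(
  table(c(1, 1, 2, 3)),
  table(c(1, 1, 1, 4, 5)),
  table(c(1, 2))
)

# Concatenate every table into one named vector; the names are the table labels
v <- do.call(c, tabs)

# Sum the counts that share a label
totals <- tapply(v, names(v), FUN = sum)
totals
#1 2 3 4 5 
#6 2 1 1 1 
```

Note that the groups are ordered by their names as character strings, so a label such as "10" would sort before "2".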

遇见更好的自我

2021-01-18 00:34

You could use rowsum(). The output will be slightly different than what you show, but you can always restructure it after the calculations. rowsum() is known to be very efficient.

```
x <- c(tab1, tab2, tab3)
rowsum(x, names(x))
#   [,1]
# 1    7
# 2    3
# 3    4
# 4    3
# 5    1
```
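
The restructuring can be as simple as selecting the single column of the result, which turns the matrix back into a named vector. A small sketch with made-up input; the vector `x` here is a hypothetical stand-in for the concatenated tables, not the asker's data:

```r
# Hypothetical stand-in for c(tab1, tab2, tab3): counts named by their label
x <- c("1" = 2L, "2" = 1L, "3" = 1L,
       "1" = 3L, "4" = 1L, "5" = 1L,
       "1" = 1L, "2" = 1L)

# rowsum() returns a one-column matrix with one row per group
m <- rowsum(x, names(x))

# Selecting that column drops the matrix to a named vector
totals <- m[, 1]
totals
#1 2 3 4 5 
#6 2 1 1 1 
```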

Here's a benchmark with akrun's data.table suggestion added in as well.

```
library(microbenchmark)
library(data.table)

xx <- rep(x, 1e5)

microbenchmark(
    tapply = tapply(xx, names(xx), FUN=sum),
    rowsum = rowsum(xx, names(xx)),
    data.table = data.table(xx, names(xx))[, sum(xx), by = V2]
)
# Unit: milliseconds
#        expr       min        lq      mean    median        uq       max neval
#      tapply 150.47532 154.80200 176.22410 159.02577 204.22043 233.34346   100
#      rowsum  41.28635  41.65162  51.85777  43.33885  45.43370 109.91777   100
#  data.table  21.39438  24.73580  35.53500  27.56778  31.93182  92.74386   100
```
