cumsum by group

Posted by 南笙酒味 on 2019-12-17 07:48:51

Question


Suppose the data looks like this:

group1 group2 num
A      sg     1
A      sh     2
A      sg     4
B      at     3
B      al     7

a <- cumsum(data[,"num"]) # 1 3 7 10 17

I need the sum accumulated within groups. In reality, I have multiple columns as grouping indicators, and I want the cumulative sum within whichever subgroup I define.

For example, if I group by group1 only, the output should be:

group1 sum
A      1
A      3
A      7
B      3
B      10

If I group by the two variables group1 and group2, the output should be:

group1 group2 sum
A      sg     1
A      sh     2
A      sg     5
B      at     3
B      al     7

Answer 1:


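library(data.table)

# Example data: one grouping column and the values to accumulate
data <- data.table(group1 = c('A','A','A','B','B'), sum = c(1,2,4,3,7))

# Cumulative sum of the `sum` column, restarted within each group1
data[, list(cumsum = cumsum(sum)), by = list(group1)]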
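
The same idea extends to the two-column grouping asked about in the question. The following is a minimal sketch, not part of the original answer: the table name `dt` and the result columns `cumsum1` and `cumsum2` are made up for illustration. It uses `:=` so the cumulative sums are added as new columns and the original row order is kept.

```r
library(data.table)

# Example data from the question (dt is a hypothetical name)
dt <- data.table(
  group1 = c("A", "A", "A", "B", "B"),
  group2 = c("sg", "sh", "sg", "at", "al"),
  num    = c(1, 2, 4, 3, 7)
)

# Cumulative sum within group1 only, added by reference
dt[, cumsum1 := cumsum(num), by = group1]

# Cumulative sum within each (group1, group2) combination
dt[, cumsum2 := cumsum(num), by = .(group1, group2)]

print(dt)
```

For this data, `cumsum1` should match the first expected output in the question (1, 3, 7, 3, 10) and `cumsum2` the second (1, 2, 5, 3, 7).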



Answer 2:


In addition to using data.table, tapply in base R works fine for both of these cases:

dta <- read.table(text="
group1 group2 num
A      sg     1
A      sh     2
A      sg     4
B      at     3
B      al     7", header=TRUE)

# tapply returns results in group order, so this assumes dta is already sorted by group1
dta$cumsum <- do.call(c, tapply(dta$num, dta$group1, FUN=cumsum))

Calculating the cumulative sum by two groups requires some reordering:

dta <- dta[order(dta$group1, dta$group2, dta$num),]

dta$cumsum2 <- do.call(c, tapply(dta$num, 
                                 paste0(dta$group1, dta$group2), 
                                 FUN=cumsum))
dta
  group1 group2 num cumsum cumsum2
1      A     sg   1      1       1
3      A     sg   4      7       5
2      A     sh   2      3       2
5      B     al   7     10       7
4      B     at   3      3       3

And if you need the original order back:

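# order() inverts the permutation; indexing by the row names directly only
# works when the reordering happens to be its own inverse, as it is here
dta[order(as.numeric(rownames(dta))),]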
  group1 group2 num cumsum cumsum2
1      A     sg   1      1       1
2      A     sh   2      3       2
3      A     sg   4      7       5
4      B     at   3      3       3
5      B     al   7     10       7
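
If the reordering step is a concern, base R's `ave` may be a simpler option, since it returns values in the original row order for any number of grouping vectors. The following is a sketch under that assumption, not part of the original answer; the data frame name `df` and the columns `cumsum1` and `cumsum2` are hypothetical.

```r
# Example data from the question (df is a hypothetical name)
df <- data.frame(
  group1 = c("A", "A", "A", "B", "B"),
  group2 = c("sg", "sh", "sg", "at", "al"),
  num    = c(1, 2, 4, 3, 7)
)

# ave() applies FUN within each group and puts the results back
# in the positions of the original rows, so no sorting is needed
df$cumsum1 <- ave(df$num, df$group1, FUN = cumsum)

# Several grouping vectors can be passed before FUN
df$cumsum2 <- ave(df$num, df$group1, df$group2, FUN = cumsum)

print(df)
```

Note that `FUN` has to be passed by name, because every unnamed argument after the first is treated as a grouping vector.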


Source: https://stackoverflow.com/questions/30277087/cumsum-by-group
