Question
This seems like it should be easy, but I've never been able to figure out how to do it. Using data.table, I want to sum a column, C, by another column, A, and keep just those two columns. At the same time, I want to be able to name the new column. My attempts and desired output:
library(data.table)
dt <- data.table(A= c('a', 'b', 'b', 'c', 'c'), B=c('19', '20', '21', '22', '23'),
C=c(150,250,20,220,130))
# Desired Output - is there a way to do this in one step using data.table? #
new.data <- dt[, sum(C), by=A]
setnames(new.data,'V1', 'C.total')
new.data
A C.total
1: a 150
2: b 270
3: c 350
# Attempt 1: Problem is that columns B and C are kept, and so are the extra rows #
new.data <- dt[, 'C.total' := sum(C), by=A]
new.data
A B C C.total
1: a 19 150 150
2: b 20 250 270
3: b 21 20 270
4: c 22 220 350
5: c 23 130 350
# Attempt 2: Problem is that the new column is not named #
new.data <- dt[, sum(C), by=A]
new.data
A V1
1: a 150
2: b 270
3: c 350
Answer 1:
Use list (or its alias, .) in j:
> dt[, list(C.total = sum(C)), by=A]
A C.total
1: a 150
2: b 270
3: c 350
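For comparison, here is a minimal sketch of the same call written with the `.()` alias, using the sample data from the question. The name `res` is only illustrative. It also checks that `dt` is left untouched, which is the difference from Attempt 1: there, `:=` adds the column to `dt` by reference, so every row and column is kept.

```r
library(data.table)

# Sample data from the question
dt <- data.table(A = c('a', 'b', 'b', 'c', 'c'),
                 B = c('19', '20', '21', '22', '23'),
                 C = c(150, 250, 20, 220, 130))

# .() is an alias for list() inside [.data.table; naming the list
# element names the aggregated column, and only A and C.total are returned
res <- dt[, .(C.total = sum(C)), by = A]
print(res)

# dt still has its original three columns, because no := was used
print(names(dt))
```

If the result should also be sorted and keyed by the grouping column, `keyby = A` can be used in place of `by = A`.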
Source: https://stackoverflow.com/questions/32124630/data-table-group-by-sum-name-new-column-and-slice-columns-in-one-step