I want to count the rows and aggregate (sum) a column in a data.table, and couldn't find the most efficient way to do this. This seems to be close to what I want R summariz
The post you are referring to shows how to apply one aggregation function to several columns. If you want to apply different aggregation functions to different columns, you can do:
dat[, .(count = .N, var = sum(VAR)), by = MNTH]
This results in:

         MNTH count var
    1: 201501     4   2
    2: 201502     3   0
    3: 201503     5   2
    4: 201504     4   2
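For a reproducible version, here is a minimal sketch. The sample data is reconstructed from the output printed in this answer; storing MNTH and VAR as plain numeric columns is an assumption, since the original column types are not shown.

```r
library(data.table)

# Sample data reconstructed from the printed output in this answer.
# Assumption: MNTH and VAR are numeric; the original types are not shown.
dat <- data.table(
  MNTH = rep(c(201501, 201502, 201503, 201504), times = c(4, 3, 5, 4)),
  VAR  = c(1, 1, 0, 0,
           0, 0, 0,
           0, 0, 1, 1, 0,
           1, 0, 1, 0)
)

# One row per month: .N counts the rows in each group,
# sum(VAR) adds up VAR within that group.
res <- dat[, .(count = .N, var = sum(VAR)), by = MNTH]
print(res)
```

Both summaries are computed in a single grouped pass, so there is no need to aggregate twice and merge the results.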
You can also add these values to your existing dataset as new columns by updating it by reference:
dat[, `:=` (count = .N, var = sum(VAR)), by = MNTH]
This results in:

    > dat
          MNTH VAR count var
     1: 201501   1     4   2
     2: 201501   1     4   2
     3: 201501   0     4   2
     4: 201501   0     4   2
     5: 201502   0     3   0
     6: 201502   0     3   0
     7: 201502   0     3   0
     8: 201503   0     5   2
     9: 201503   0     5   2
    10: 201503   1     5   2
    11: 201503   1     5   2
    12: 201503   0     5   2
    13: 201504   1     4   2
    14: 201504   0     4   2
    15: 201504   1     4   2
    16: 201504   0     4   2
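Note that `:=` changes `dat` itself, so no assignment is needed. If you would rather keep the original table untouched, one option is to work on a copy made with `copy()`. The sketch below assumes the same reconstructed sample data as in the printed output (numeric columns are an assumption), and `dat2` is just an illustrative name.

```r
library(data.table)

# Sample data reconstructed from the printed output in this answer.
# Assumption: MNTH and VAR are numeric; the original types are not shown.
dat <- data.table(
  MNTH = rep(c(201501, 201502, 201503, 201504), times = c(4, 3, 5, 4)),
  VAR  = c(1, 1, 0, 0,
           0, 0, 0,
           0, 0, 1, 1, 0,
           1, 0, 1, 0)
)

# copy() makes a deep copy, so the `:=` below does not modify `dat`.
dat2 <- copy(dat)
dat2[, `:=`(count = .N, var = sum(VAR)), by = MNTH]

names(dat)   # original columns only
names(dat2)  # original columns plus count and var
```

A plain `dat2 <- dat` would not be enough here, because both names would then refer to the same table and the update would show up in both.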
For further reading about how to use data.table syntax, see the Getting started guides on the GitHub wiki.