I have a dataset which, simplified, looks something like this:
Year ID Sum
2009 999 100
2009 123 85
2009 666 100
2009 999 10
You can use dplyr together with the base function cumsum to get a running total within each Year/ID group:
library(dplyr)
dataset %>%
group_by(Year, ID) %>%
mutate(cumsum = cumsum(Sum)) %>%
ungroup()
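If you would rather not load a package, here is a minimal sketch of the same grouped running total using base R's ave(). This swaps in a different function from the dplyr pipeline above, and the data frame is only illustrative (an assumption, not the asker's full dataset).

```r
# Illustrative data (assumption: made up for this example)
dataset <- data.frame(
  Year = c(2009, 2009, 2009, 2009, 2009, 2009),
  ID   = c(999, 123, 666, 999, 123, 666),
  Sum  = c(100, 85, 100, 100, 90, 85)
)

# ave() applies FUN within each Year/ID combination and returns
# the results in the original row order
dataset$CumSum <- ave(dataset$Sum, dataset$Year, dataset$ID, FUN = cumsum)

print(dataset)
```

The running total restarts for every distinct Year/ID pair, so the second 999 row accumulates on top of the first one.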
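Another way:
1) use ddply (from plyr) to sum a variable by group (similar to SQL GROUP BY); note that this gives each group's total rather than a running sum
library(plyr)
X <- ddply(dataset, .(Year, ID), summarise, Total = sum(Sum))
2) merge the result with dataset
Y <- merge(dataset, X, by = c("Year", "ID"))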
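To see what the two-step approach produces, here is a rough base R sketch of the same idea, with aggregate() standing in for ddply (a substitution on my part, so it runs without plyr). The data frame holds just the four sample rows from the question, and the column name Total is my own choice.

```r
# The four sample rows shown in the question
dataset <- data.frame(
  Year = c(2009, 2009, 2009, 2009),
  ID   = c(999, 123, 666, 999),
  Sum  = c(100, 85, 100, 10)
)

# Step 1: one row per Year/ID holding the group total
totals <- aggregate(Sum ~ Year + ID, data = dataset, FUN = sum)
names(totals)[names(totals) == "Sum"] <- "Total"  # hypothetical column name

# Step 2: attach the group total to every original row
merged <- merge(dataset, totals, by = c("Year", "ID"))

print(merged)
```

Every row of a group gets the same Total, so this answers "what is the sum per group" rather than "what is the running sum so far".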
Using data.table:
library(data.table)
DT <- data.table(DF)  # DF is the original data frame
DT[, Cum.Sum := cumsum(Sum), by = list(Year, ID)]  # adds the column by reference
DT  # print the result
    Year  ID Sum Cum.Sum
 1: 2009 999 100     100
 2: 2009 123  85      85
 3: 2009 666 100     100
 4: 2009 999 100     200
 5: 2009 123  90     175
 6: 2009 666  85     185
 7: 2010 999 100     100
 8: 2010 123 100     100
 9: 2010 666  95      95
10: 2010 999  75     175
11: 2010 123 100     200
12: 2010 666  85     180