I have a data frame df (which can be downloaded here) containing a register of companies. It looks something like this:
Provider.ID Lo
When you group by Local.Authority and year, cumsum restarts within each year, so for Kent it simply returns the values of total: 1, -1, 1. Group by Local.Authority only, so that cumsum accumulates total across years and returns 1, 0, 1:
library(dplyr)

# Running total of `total` within each local authority
df <- df %>%
  group_by(Local.Authority) %>%
  mutate(cum.to = cumsum(total))
> df
Source: local data frame [3 x 8]
Groups: Local.Authority [1]
Provider.ID Local.Authority month year entry exit total cum.to
<chr> <chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
1 1-102642676 Kent 10 2010 1 0 1 1
2 1-102642676 Kent 9 2011 0 1 -1 0
3 1-102642676 Kent 10 2014 1 0 1 1
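To make the difference between the two groupings concrete, here is a minimal sketch that rebuilds the three Kent rows by hand and compares both approaches. The data frame kent and the result names by_both and by_la are made up for illustration; only the column names and values come from the output above.

```r
library(dplyr)

# Hypothetical three-row sample copied from the Kent rows shown above
kent <- data.frame(
  Provider.ID     = rep("1-102642676", 3),
  Local.Authority = rep("Kent", 3),
  month           = c(10, 9, 10),
  year            = c(2010, 2011, 2014),
  total           = c(1, -1, 1),
  stringsAsFactors = FALSE
)

# Grouping by authority AND year: every group has one row,
# so cumsum just returns `total` unchanged
by_both <- kent %>%
  group_by(Local.Authority, year) %>%
  mutate(cum.to = cumsum(total)) %>%
  ungroup()

# Grouping by authority only: cumsum runs across the years
by_la <- kent %>%
  group_by(Local.Authority) %>%
  mutate(cum.to = cumsum(total)) %>%
  ungroup()

by_both$cum.to  # 1 -1  1
by_la$cum.to    # 1  0  1
```

The ungroup() calls are optional here; they just return plain data frames so that later operations are not silently applied per group.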
I found the solution to my problem. After restarting my R session, I got the expected result by grouping by Local.Authority only and then arranging:
> df.1 = df %>% group_by(Local.Authority) %>%
+ mutate(cum.total = cumsum(total)) %>%
+ arrange(year, month, Local.Authority)
> df.1
Source: local data frame [41 x 8]
Groups: Local.Authority [36]
Provider.ID Local.Authority month year entry exit total cum.total
<fctr> <fctr> <int> <int> <int> <int> <int> <int>
1 1-102642676 Bexley 10 2010 1 0 1 1
2 1-102642676 Brent 10 2010 1 0 1 1
3 1-102642676 Bristol, City of 10 2010 1 0 1 1
4 1-102642676 Bury 10 2010 1 0 1 1
5 1-102642676 Cambridgeshire 10 2010 1 0 1 1
6 1-102642676 Cheshire East 10 2010 2 0 2 2
7 1-102642676 East Sussex 10 2010 5 0 5 5
8 1-102642676 Enfield 10 2010 1 0 1 1
9 1-102642676 Essex 10 2010 1 0 1 1
10 1-102642676 Hampshire 10 2010 1 0 1 1
Checking "Kent" now yields the expected result:
> check = df.1 %>% filter(Local.Authority == "Kent")
> check
Source: local data frame [3 x 8]
Groups: Local.Authority [1]
Provider.ID Local.Authority month year entry exit total cum.total
<fctr> <fctr> <int> <int> <int> <int> <int> <int>
1 1-102642676 Kent 10 2010 1 0 1 1
2 1-102642676 Kent 9 2011 0 1 -1 0
3 1-102642676 Kent 10 2014 1 0 1 1
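One caveat worth noting: cumsum accumulates in the current row order, so arranging after the mutate only changes how the rows are displayed, not the running totals. The result above is correct presumably because the data were already in chronological order. If that is not guaranteed, a safer variant is to sort first. The sketch below uses a deliberately shuffled, made-up copy of the Kent rows (shuffled and ordered are hypothetical names):

```r
library(dplyr)

# Hypothetical Kent rows in non-chronological order
shuffled <- data.frame(
  Local.Authority = c("Kent", "Kent", "Kent"),
  month           = c(10, 10, 9),
  year            = c(2014, 2010, 2011),
  total           = c(1, 1, -1),
  stringsAsFactors = FALSE
)

# Sort chronologically BEFORE accumulating, then take the running total
ordered <- shuffled %>%
  arrange(Local.Authority, year, month) %>%
  group_by(Local.Authority) %>%
  mutate(cum.total = cumsum(total)) %>%
  ungroup()

ordered$cum.total  # 1 0 1
```

Without the arrange step, the same shuffled rows would accumulate as 1, 2, 1, which no longer reflects the state of the register over time.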