I have a dataset like the following, where "group" is a group variable. I want to count the number of 'next' days by group, but if it is not the next day I want the count to
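One option is to group by 'group', take diff() of 'date' after converting it to Date class, build a logical vector that flags every row whose gap to the previous date is not one day, and apply cumsum() to it so that each run of consecutive days gets its own id. Numbering the rows within each run with rowid() (from data.table) replicates 'want' as 'wantn', and applying max() to 'wantn' within each group replicates 'want2' as 'want2n'.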
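library(dplyr)
library(data.table)  # only needed for rowid()

df %>%
  group_by(group) %>%
  # a new run starts wherever the gap to the previous date is not 1 day;
  # rowid() then numbers the rows within each run
  mutate(wantn = rowid(cumsum(c(TRUE, diff(as.Date(date)) != 1))),
         want2n = max(wantn))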
# A tibble: 7 x 6
# Groups:   group [2]
#   group date        want want2 wantn want2n
#1      1 2000-01-01     1     3     1      3
#2      1 2000-01-03     1     3     1      3
#3      1 2000-01-04     2     3     2      3
#4      1 2000-01-05     3     3     3      3
#5      2 2000-01-09     1     2     1      2
#6      2 2000-01-10     2     2     2      2
#7      2 2000-01-12     1     2     1      2
Or, if we do not want to use rowid(), create a second grouping variable with cumsum() and number the rows within it with row_number():
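df %>%
  group_by(group) %>%
  # group2 is the run id within each group;
  # dplyr >= 1.0.0 spells the argument .add (older versions: add = TRUE)
  group_by(group2 = cumsum(c(TRUE, diff(as.Date(date)) != 1)), .add = TRUE) %>%
  mutate(wantn = row_number()) %>%
  # regroup by 'group' only, so max() is taken over the whole group
  group_by(group) %>%
  mutate(want2n = max(wantn)) %>%
  select(-group2)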
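For a version that can be run without any packages, the same idea could be sketched in base R with ave(). This is not the approach used above, just an alternative under stated assumptions: the data frame is reconstructed from the printed output (the original 'df' is not shown), and starts_run and run_id are illustrative names.

```r
# Sample data reconstructed from the printed output (an assumption)
df <- data.frame(
  group = c(1, 1, 1, 1, 2, 2, 2),
  date  = c("2000-01-01", "2000-01-03", "2000-01-04", "2000-01-05",
            "2000-01-09", "2000-01-10", "2000-01-12"),
  want  = c(1, 1, 2, 3, 1, 2, 1),
  want2 = c(3, 3, 3, 3, 2, 2, 2)
)

# 1 where a row starts a new run of consecutive days within its group, else 0
starts_run <- ave(as.numeric(as.Date(df$date)), df$group,
                  FUN = function(d) c(1, diff(d) != 1))

# Run id within each group
run_id <- ave(starts_run, df$group, FUN = cumsum)

# Position of each row within its run, then the longest run per group
df$wantn  <- ave(starts_run, df$group, run_id, FUN = seq_along)
df$want2n <- ave(df$wantn, df$group, FUN = max)

df
```

The 'wantn' and 'want2n' columns should agree with 'want' and 'want2' on this sample.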