Grouping data in weeks using R

Question

I have a CSV file that contains data for different countries for different weeks of this year. I want to create a summary data frame (in R) that groups together the data for weeks 21-24 and for weeks 37-41. The data is laid out as in the attached example:

I am a beginner and am not sure where to start. Thanks.

Answer 1:

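We can use case_when to construct a grouping column from the week number in 'year_week' (the part after "-W"), then group by 'country' and that column and summarise the sum of 'new_cases'. Note that the TRUE ~ 'W37_W41' branch labels every week outside 21-24 as 'W37_W41'; that is safe here because the sample data below contain only weeks 21-24 and 37-41.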

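library(dplyr)
library(stringr)

df1 %>%
   # strip everything up to "-W" to get the week number, then label its range
   group_by(country,
            grp = case_when(as.numeric(str_remove(year_week, ".*-W")) %in% 21:24 ~ 'W21_W24',
                            TRUE ~ 'W37_W41')) %>%
   # total the cases per country and week range; return an ungrouped tibble
   summarise(new_cases = sum(new_cases, na.rm = TRUE), .groups = 'drop')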

-output

# A tibble: 6 x 3
#  country  grp     new_cases
#  <chr>    <chr>       <dbl>
#1 Austria  W21_W24       874
#2 Austria  W37_W41     19045
#3 Belgium  W21_W24      4231
#4 Belgium  W37_W41     80918
#5 Bulgaria W21_W24       555
#6 Bulgaria W37_W41      6917
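
If the full CSV also contains weeks outside the two ranges, the catch-all TRUE branch would count them under 'W37_W41'. A minimal sketch of one way around that, assuming the same 'year_week' format, is to match both ranges explicitly and drop the rows that match neither. The data frame `toy` and its values are made up for illustration and are not from the question:

```r
library(dplyr)
library(stringr)

# Hypothetical toy data: week 30 belongs to neither range
toy <- data.frame(
  country   = c("Austria", "Austria", "Austria", "Austria"),
  year_week = c("2020-W21", "2020-W30", "2020-W37", "2020-W41"),
  new_cases = c(10, 99, 20, 30)
)

res <- toy %>%
  mutate(
    # "2020-W21" -> 21
    week = as.numeric(str_remove(year_week, ".*-W")),
    # weeks matching neither condition get NA
    grp  = case_when(week %in% 21:24 ~ "W21_W24",
                     week %in% 37:41 ~ "W37_W41")
  ) %>%
  filter(!is.na(grp)) %>%          # drop weeks outside both ranges
  group_by(country, grp) %>%
  summarise(new_cases = sum(new_cases, na.rm = TRUE), .groups = "drop")

res
```

Here the week 30 row is left out of both totals instead of being added to 'W37_W41'.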

data

df1 <- structure(list(country = c("Austria", "Austria", "Austria", "Austria", 
"Austria", "Austria", "Austria", "Austria", "Belgium", "Belgium", 
"Belgium", "Belgium", "Belgium", "Belgium", "Belgium", "Belgium", 
"Belgium", "Bulgaria", "Bulgaria", "Bulgaria", "Bulgaria", "Bulgaria", 
"Bulgaria"), country_code = c("AT", "AT", "AT", "AT", "AT", "AT", 
"AT", "AT", "BE", "BE", "BE", "BE", "BE", "BE", "BE", "BE", "BE", 
"BG", "BG", "BG", "BG", "BG", "BG"), year_week = c("2020-W21", 
"2020-W22", "2020-W23", "2020-W24", "2020-W37", "2020-W38", "2020-W39", 
"2020-W40", "2020-W21", "2020-W22", "2020-W23", "2020-W24", "2020-W37", 
"2020-W38", "2020-W39", "2020-W40", "2020-W41", "2020-W24", "2020-W37", 
"2020-W38", "2020-W39", "2020-W40", "2020-W41"), new_cases = c(267, 
231, 184, 192, 3977, 4997, 4992, 5079, 1516, 1170, 843, 702, 
6012, 9947, 11192, 18368, 35399, 555, 937, 928, 1178, 1521, 2353
)), class = "data.frame", row.names = c(NA, -23L))

Source: https://stackoverflow.com/questions/65159760/grouping-data-in-weeks-using-r

Tags

grouping