R: Round down dates to first day of the week

抹茶落季 2021-01-04 20:37

I have a data frame in which one of the columns contains dates (some dates appear multiple times). I want to aggregate the data by week. The best way I can think of to do this is to round each date down to the first day of its week.

3 Answers
  • 2021-01-04 21:12

    With lubridate you could try this:

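    library(lubridate)
    # Eleven consecutive days, starting on Monday 2016-04-04
    dates <- seq.Date(as.Date("2016-04-04"), as.Date("2016-04-14"), by = 1)
    # Shift back one day, round down to Sunday, then shift forward to Monday
    floor_date(dates - 1, "weeks") + 1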
    

    By default, floor_date() treats Sunday as the first day of the week. To get weeks that start on Monday, subtract one day before rounding (so that a Sunday falls into the preceding week) and add one day afterwards (so that the result moves from Sunday to Monday).
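
    To make the shift easier to follow, here is a small sketch that compares the default rounding with the shifted version around a week boundary. It assumes the lubridate.week.start option has not been changed from its default; the variable names are only illustrative.

    ```r
    library(lubridate)

    # 2016-04-09 is a Saturday, 2016-04-10 a Sunday, 2016-04-11 a Monday
    d <- as.Date(c("2016-04-09", "2016-04-10", "2016-04-11"))

    # Default behaviour: weeks start on Sunday
    sunday_based <- floor_date(d, "weeks")

    # Shifted version: weeks start on Monday
    monday_based <- floor_date(d - 1, "weeks") + 1

    data.frame(date = d, sunday_based, monday_based)
    ```

    With the default rounding, Sunday 2016-04-10 opens a new week; with the shift, it stays in the week that began on Monday 2016-04-04.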

  • 2021-01-04 21:23

    cut() in base R has methods for objects of class Date and POSIXt. Both assume by default that weeks start on Monday; this can be changed to Sunday with start.on.monday = FALSE.

    dates <- c("2016-04-04", "2016-04-05", "2016-04-06", "2016-04-07", "2016-04-08", 
               "2016-04-09", "2016-04-10", "2016-04-11", "2016-04-12", "2016-04-13", 
               "2016-04-14")
    result <- data.frame(
      dates,
      cut_Date = cut(as.Date(dates), "week"),
      cut_POSIXt = cut(as.POSIXct(dates), "week"),
      stringsAsFactors = FALSE)
    
    result
    #        dates   cut_Date cut_POSIXt
    #1  2016-04-04 2016-04-04 2016-04-04
    #2  2016-04-05 2016-04-04 2016-04-04
    #3  2016-04-06 2016-04-04 2016-04-04
    #4  2016-04-07 2016-04-04 2016-04-04
    #5  2016-04-08 2016-04-04 2016-04-04
    #6  2016-04-09 2016-04-04 2016-04-04
    #7  2016-04-10 2016-04-04 2016-04-04
    #8  2016-04-11 2016-04-11 2016-04-11
    #9  2016-04-12 2016-04-11 2016-04-11
    #10 2016-04-13 2016-04-11 2016-04-11
    #11 2016-04-14 2016-04-11 2016-04-11
    

    Note that cut() returns factors, which is convenient for the aggregation requested by the OP:

    str(result)
    #'data.frame':  11 obs. of  3 variables:
    # $ dates     : chr  "2016-04-04" "2016-04-05" "2016-04-06" "2016-04-07" ...
    # $ cut_Date  : Factor w/ 2 levels "2016-04-04","2016-04-11": 1 1 1 1 1 1 1 2 2 2 ...
    # $ cut_POSIXt: Factor w/ 2 levels "2016-04-04","2016-04-11": 1 1 1 1 1 1 1 2 2 2 ...
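
    As a rough sketch of the aggregation step itself, the week factor can be used as a grouping variable in aggregate(). The data frame df and its value column n are hypothetical and only serve as an illustration.

    ```r
    # Hypothetical data: some dates repeat, 'n' is the value to aggregate
    df <- data.frame(
      date = as.Date(c("2016-04-04", "2016-04-06", "2016-04-06",
                       "2016-04-11", "2016-04-14")),
      n = c(1, 2, 3, 4, 5)
    )

    # Label each row with the Monday that starts its week
    df$week <- cut(df$date, "week")

    # Sum 'n' within each week
    weekly <- aggregate(n ~ week, data = df, FUN = sum)
    weekly
    ```

    Other summaries (mean, length, and so on) can be passed as FUN in the same way.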
    

    However, when plotting aggregated values with ggplot2, a large number of weeks can clutter a discrete axis, so it may be better to switch to a continuous time scale. That requires coercing the factors back to Date or POSIXct:

    as.Date(as.character(result$cut_Date))
    as.POSIXct(as.character(result$cut_Date))
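
    A possible way to use the coerced dates on a continuous axis is sketched below. The weekly data frame and its column n are hypothetical, and the choice of breaks and labels is only one option.

    ```r
    library(ggplot2)

    # Hypothetical weekly totals with the week stored as a factor, as cut() returns it
    weekly <- data.frame(
      week = factor(c("2016-04-04", "2016-04-11", "2016-04-18")),
      n = c(6, 9, 4)
    )

    # Coerce the factor labels back to Date for a continuous time scale
    weekly$week <- as.Date(as.character(weekly$week))

    # One bar per week; the axis is labelled at weekly intervals
    p <- ggplot(weekly, aes(x = week, y = n)) +
      geom_col() +
      scale_x_date(date_breaks = "1 week", date_labels = "%d %b")
    p
    ```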
    
  • 2021-01-04 21:29

    Since lubridate version 1.7.0, floor_date() has a week_start parameter that lets you specify the first day of the week (1 for Monday, 7 for Sunday). This allows you to write:

    library(lubridate)
    dates <- seq.Date(as.Date("2016-04-04"), as.Date("2016-04-14"), by = 1)
    # week_start = 1 makes weeks begin on Monday
    floor_date(dates, "weeks", week_start = 1)
    

    I would post it as a comment to Sraffa's response but I don't have the reputation.
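
    As a quick check, the sketch below applies week_start = 1 to a few dates with repeats, confirms that it agrees with the subtract-and-add approach, and counts the dates per week with table(). It assumes lubridate 1.7.0 or later and an unchanged lubridate.week.start option; the sample dates are made up for illustration.

    ```r
    library(lubridate)

    # Sample dates with a repeat; 2016-04-10 is a Sunday
    dates <- as.Date(c("2016-04-04", "2016-04-10", "2016-04-10", "2016-04-11"))

    # Round down to the Monday that starts each week
    week <- floor_date(dates, "weeks", week_start = 1)

    # Same result as shifting by one day around the default Sunday-based rounding
    same_as_shift <- all(week == floor_date(dates - 1, "weeks") + 1)

    # Number of dates falling into each Monday-based week
    counts <- table(week)
    counts
    ```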
