Rolling sums for groups with uneven time gaps

说谎 2021-02-16 00:19

Here's a tweak to my previously posted question. Here's my data:

set.seed(3737)
DF2 = data.frame(user_id = c(rep(27, 7), rep(11, 7)),
            date = as.D         


        
6 answers
  •  隐瞒了意图╮
    2021-02-16 00:45

    Logic: group by user_id and then by date. For each group, between() checks which of that user's dates fall between the current date and 7 (or 14) days earlier, bounds included, and returns a logical vector.

    I then use this logical vector to pick the entries of value to sum.

    library(data.table)
    # One group per (user_id, date) row: subset DF2 to the same user, keep the
    # dates inside the closed window [date - n, date], and sum their values.
    setDT(DF2)[, `:=`(v_minus7 = sum(DF2$value[DF2$user_id == user_id][between(DF2$date[DF2$user_id == user_id], date-7, date, incbounds = TRUE)]), 
                     v_minus14 = sum(DF2$value[DF2$user_id == user_id][between(DF2$date[DF2$user_id == user_id], date-14, date, incbounds = TRUE)])),
               by = c("user_id", "date")][]
     #   user_id       date value v_minus7 v_minus14
     #1:      27 2016-01-01  15.0     15.0      15.0
     #2:      27 2016-01-03  22.4     37.4      37.4
     #3:      27 2016-01-05  13.3     50.7      50.7
     #4:      27 2016-01-07  21.9     72.6      72.6
     #5:      27 2016-01-10  20.6     78.2      93.2
     #6:      27 2016-01-14  18.6     61.1     111.8
     #7:      27 2016-01-16  16.4     55.6     113.2
     #8:      11 2016-01-01   6.8      6.8       6.8
     #9:      11 2016-01-03  21.3     28.1      28.1
    #10:      11 2016-01-05  19.8     47.9      47.9
    #11:      11 2016-01-07  22.0     69.9      69.9
    #12:      11 2016-01-10  19.4     82.5      89.3
    #13:      11 2016-01-14  17.5     58.9     106.8
    #14:      11 2016-01-16  19.3     56.2     119.3
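
To make the window logic concrete, here is a minimal base R sketch for a single row. It uses user 27's dates and values as printed in the output above; `window_sum` is a hypothetical helper name, and the plain `>=` / `<=` comparison is assumed to stand in for between() with incbounds = TRUE.

```r
# User 27's rows, copied from the printed output above.
dates  <- as.Date(c("2016-01-01", "2016-01-03", "2016-01-05", "2016-01-07",
                    "2016-01-10", "2016-01-14", "2016-01-16"))
values <- c(15.0, 22.4, 13.3, 21.9, 20.6, 18.6, 16.4)

# Hypothetical helper: sum of values dated within [current - days, current].
window_sum <- function(current, days) {
  in_window <- dates >= current - days & dates <= current
  sum(values[in_window])
}

row5 <- as.Date("2016-01-10")

# Logical vector for the 7-day window ending on 2016-01-10:
dates >= row5 - 7 & dates <= row5
# FALSE  TRUE  TRUE  TRUE  TRUE FALSE FALSE

window_sum(row5, 7)   # 22.4 + 13.3 + 21.9 + 20.6 = 78.2
window_sum(row5, 14)  # all five rows up to 2016-01-10 = 93.2
```

Both results agree with row 5 of the table (v_minus7 = 78.2, v_minus14 = 93.2).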
    

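    # Alternative, from alexis_laz's answer: cumulative sums per user.
    # Assumes date is sorted in increasing order within each user_id.
    ff = function(date, value, minus){
      cs = cumsum(value)                      # running total of value
      # i[k]: index of the last row dated on or before date[k] - minus (0 if none)
      i = findInterval(date - minus, date, rightmost.closed = TRUE) 
      w = which(as.logical(i))                # rows whose window drops earlier rows
      i[w] = cs[i[w]]                         # replace each index by the total up to it
      cs - i                                  # sum over dates in (date - minus, date]
    } 
    setDT(DF2)
    DF2[, `:=`( v_minus7 = ff(date, value, 7), 
                v_minus14 = ff(date, value, 14)), by = c("user_id")]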
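
One thing worth checking when swapping the two approaches: ff() drops a row dated exactly `minus` days back, while the between() version with incbounds = TRUE keeps it, so the two can disagree on boundary days. The sketch below runs ff() on user 27's rows (copied from the printed table) next to a hypothetical variant, `ff_closed`, which is not part of either answer. It assumes dates are sorted and unique within a user, and relies on the `left.open` argument of findInterval() (available since R 3.3.0).

```r
# User 27's rows, copied from the printed table (already sorted by date).
dates  <- as.Date(c("2016-01-01", "2016-01-03", "2016-01-05", "2016-01-07",
                    "2016-01-10", "2016-01-14", "2016-01-16"))
values <- c(15.0, 22.4, 13.3, 21.9, 20.6, 18.6, 16.4)

# ff() as posted above.
ff <- function(date, value, minus) {
  cs <- cumsum(value)
  i <- findInterval(date - minus, date, rightmost.closed = TRUE)
  w <- which(as.logical(i))
  i[w] <- cs[i[w]]
  cs - i
}

# Hypothetical variant: with left.open = TRUE, findInterval() counts only
# rows dated strictly before date - minus, so the boundary day stays in.
ff_closed <- function(date, value, minus) {
  cs <- cumsum(value)
  i <- findInterval(date - minus, date, left.open = TRUE)
  cs - c(0, cs)[i + 1]   # subtract the running total of the dropped rows
}

round(ff(dates, values, 7), 1)
# 15.0 37.4 50.7 72.6 55.8 39.2 55.6
round(ff_closed(dates, values, 7), 1)
# 15.0 37.4 50.7 72.6 78.2 61.1 55.6
```

Rows 5 and 6 differ because 2016-01-03 and 2016-01-07 sit exactly 7 days before 2016-01-10 and 2016-01-14. The `ff_closed` output matches the v_minus7 column printed for user 27; which window is right depends on whether "the last 7 days" should include the day exactly 7 days back.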
    
