Rolling sums for groups with uneven time gaps

误落风尘 2021-02-15 23:42

Here's a tweak to my previously posted question. Here's my data:

set.seed(3737)
DF2 = data.frame(user_id = c(rep(27, 7), rep(11, 7)),
            date = as.D         


        
6 answers
  •  独厮守ぢ
    2021-02-16 00:30

    Here is another idea, using findInterval to minimize comparisons and operations. First define a function that handles the basic computation, ignoring the grouping. It computes the cumulative sum of value and, at each position, subtracts the cumulative sum accumulated strictly before date - minus. What remains is the sum over the window from date - minus to date, both ends included:

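    # `date` must be sorted in non-decreasing order (findInterval requires it)
    ff = function(date, value, minus)
    {
        cs = cumsum(value)           # running total through each row
        # number of dates strictly before (date - minus), i.e. outside the window
        i = findInterval(date - minus, date, left.open = TRUE)
        w = which(as.logical(i))     # rows that have dates outside the window
        i[w] = cs[i[w]]              # swap those counts for the running total to drop
        cs - i                       # sum over dates in [date - minus, date]
    }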
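
To make the intermediate steps visible, here is a small walk-through for minus = 7 on the rows of user 11, with the dates and values copied from the output table in this answer. It is only a sketch: the names `before` and `window_sum` are illustrative, and the `c(0, cs)[i + 1]` lookup is a shorthand that should be equivalent to the `which()` step in `ff`, not the code used there.

```r
# Dates and values for user 11, copied from the output table in this answer.
date  <- as.Date(c("2016-01-01", "2016-01-03", "2016-01-05", "2016-01-07",
                   "2016-01-10", "2016-01-14", "2016-01-16"))
value <- c(6.8, 21.3, 19.8, 22.0, 19.4, 17.5, 19.3)

# Running total up to and including each row.
cs <- cumsum(value)

# For each row, how many dates fall strictly before (date - 7).
# findInterval() needs `date` sorted in non-decreasing order.
i <- findInterval(date - 7, date, left.open = TRUE)

# Running total accumulated before each window starts
# (0 when no earlier row lies outside the window).
before <- c(0, cs)[i + 1]

# Sum of values whose date lies in [date - 7, date].
window_sum <- cs - before

print(data.frame(date, value, cs, i, before, window_sum))
```

The `window_sum` column should reproduce the minus7 column shown for user 11. Note that the window is closed on both ends, so a row dated exactly 7 days earlier is still counted.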
    

    And apply it by group:

    do.call(rbind, 
            lapply(split(DF2, DF2$user_id), 
                   function(x) data.frame(x, 
                             minus7 = ff(x$date, x$value, 7), 
                             minus14 = ff(x$date, x$value, 14))))
    #      user_id       date value minus7 minus14
    #11.8       11 2016-01-01   6.8    6.8     6.8
    #11.9       11 2016-01-03  21.3   28.1    28.1
    #11.10      11 2016-01-05  19.8   47.9    47.9
    #11.11      11 2016-01-07  22.0   69.9    69.9
    #11.12      11 2016-01-10  19.4   82.5    89.3
    #11.13      11 2016-01-14  17.5   58.9   106.8
    #11.14      11 2016-01-16  19.3   56.2   119.3
    #27.1       27 2016-01-01  15.0   15.0    15.0
    #27.2       27 2016-01-03  22.4   37.4    37.4
    #27.3       27 2016-01-05  13.3   50.7    50.7
    #27.4       27 2016-01-07  21.9   72.6    72.6
    #27.5       27 2016-01-10  20.6   78.2    93.2
    #27.6       27 2016-01-14  18.6   61.1   111.8
    #27.7       27 2016-01-16  16.4   55.6   113.2
    

    The apply-by-group step above can, of course, be replaced by whichever grouping method you prefer.
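
As one possible alternative for the grouping step, the sketch below uses base R's `ave()` on row indices, which keeps the rows in their original order instead of reordering them by group. The data frame `DF` is rebuilt from the output table above because the question's own data is cut off, so the row order (user 27 first) is an assumption. The helper `rolling` is a hypothetical name, not part of the original answer.

```r
# ff() as defined earlier in this answer.
ff <- function(date, value, minus) {
  cs <- cumsum(value)
  i <- findInterval(date - minus, date, left.open = TRUE)
  w <- which(as.logical(i))
  i[w] <- cs[i[w]]
  cs - i
}

# Data rebuilt from the output table above (assumed row order: user 27 first).
dates <- as.Date(c("2016-01-01", "2016-01-03", "2016-01-05", "2016-01-07",
                   "2016-01-10", "2016-01-14", "2016-01-16"))
DF <- data.frame(
  user_id = rep(c(27, 11), each = 7),
  date    = rep(dates, 2),
  value   = c(15.0, 22.4, 13.3, 21.9, 20.6, 18.6, 16.4,
              6.8, 21.3, 19.8, 22.0, 19.4, 17.5, 19.3)
)

# Hypothetical helper: ave() passes each group's row indices to FUN,
# and ff() is evaluated on the matching slices of date and value.
rolling <- function(df, minus) {
  ave(seq_len(nrow(df)), df$user_id,
      FUN = function(idx) ff(df$date[idx], df$value[idx], minus))
}

DF$minus7  <- rolling(DF, 7)
DF$minus14 <- rolling(DF, 14)
print(DF)
```

Like `ff` itself, this assumes the dates are already sorted within each user.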
