Rolling sums for groups with uneven time gaps

说谎 2021-02-16 00:19

Here's the tweak to my previously posted question. Here's my data:

set.seed(3737)
DF2 = data.frame(user_id = c(rep(27, 7), rep(11, 7)),
            date = as.D         


        
6 Answers
  •  长情又很酷
    2021-02-16 00:38

    Here is another idea, using findInterval to minimize comparisons and operations. First define a function that handles the core computation, ignoring the grouping. It computes the cumulative sum of value and, at each position, subtracts the cumulative sum reached at the last date strictly before date - minus, so the result is the sum over the window [date - minus, date]. The dates must be sorted in ascending order, as findInterval requires:

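    ff = function(date, value, minus)
    {
        # running total of value; date must be sorted ascending
        cs = cumsum(value)
        # i = number of dates strictly before (date - minus), i.e. outside the window
        i = findInterval(date - minus, date, left.open = TRUE)
        # where such dates exist, replace the index by the running total at that index
        w = which(as.logical(i))
        i[w] = cs[i[w]]
        # window sum = running total minus the total accumulated before the window
        cs - i
    }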
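
    To see what the function returns, here is a minimal sketch that checks it against a brute-force window sum. The vectors d and v and the helper brute are hypothetical, made up for illustration; they are not the question's data. ff is repeated so the snippet runs on its own.

```r
# Re-declare ff() from the answer so this sketch is self-contained.
ff = function(date, value, minus)
{
    cs = cumsum(value)
    i = findInterval(date - minus, date, left.open = TRUE)
    w = which(as.logical(i))
    i[w] = cs[i[w]]
    cs - i
}

# Hypothetical toy data (an assumption, not from the question):
# ascending dates with uneven gaps.
d = as.Date(c("2016-01-01", "2016-01-03", "2016-01-10", "2016-01-14"))
v = c(1, 2, 4, 8)

# Brute-force reference: for each row, sum the values whose date
# lies in the closed window [date - minus, date].
brute = function(date, value, minus)
    sapply(seq_along(date), function(k)
        sum(value[date >= date[k] - minus & date <= date[k]]))

fast = ff(d, v, 7)     # 1 3 6 12
slow = brute(d, v, 7)  # same values, computed with one comparison pass per row
```

    The brute-force version compares every row against every other row, while ff needs one cumulative sum and one findInterval call, which is where the saving comes from.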
    

    And apply it by group:

    do.call(rbind, 
            lapply(split(DF2, DF2$user_id), 
                   function(x) data.frame(x, 
                             minus7 = ff(x$date, x$value, 7), 
                             minus14 = ff(x$date, x$value, 14))))
    #      user_id       date value minus7 minus14
    #11.8       11 2016-01-01   6.8    6.8     6.8
    #11.9       11 2016-01-03  21.3   28.1    28.1
    #11.10      11 2016-01-05  19.8   47.9    47.9
    #11.11      11 2016-01-07  22.0   69.9    69.9
    #11.12      11 2016-01-10  19.4   82.5    89.3
    #11.13      11 2016-01-14  17.5   58.9   106.8
    #11.14      11 2016-01-16  19.3   56.2   119.3
    #27.1       27 2016-01-01  15.0   15.0    15.0
    #27.2       27 2016-01-03  22.4   37.4    37.4
    #27.3       27 2016-01-05  13.3   50.7    50.7
    #27.4       27 2016-01-07  21.9   72.6    72.6
    #27.5       27 2016-01-10  20.6   78.2    93.2
    #27.6       27 2016-01-14  18.6   61.1   111.8
    #27.7       27 2016-01-16  16.4   55.6   113.2
    

    The apply-by-group step above can, of course, be replaced by whichever grouping method you prefer.
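
    As one possible alternative, the sketch below uses base R split and unsplit, which writes the result back in the original row order instead of reordering rows by group. The data frame DF is hypothetical toy data, not the question's DF2, and it assumes dates are already ascending within each user_id. ff is repeated so the snippet runs on its own.

```r
# Re-declare ff() from the answer so this sketch is self-contained.
ff = function(date, value, minus)
{
    cs = cumsum(value)
    i = findInterval(date - minus, date, left.open = TRUE)
    w = which(as.logical(i))
    i[w] = cs[i[w]]
    cs - i
}

# Hypothetical toy data (an assumption): two users with interleaved rows,
# dates ascending within each user.
DF = data.frame(
    user_id = c(27, 11, 27, 11, 27),
    date    = as.Date(c("2016-01-01", "2016-01-02", "2016-01-03",
                        "2016-01-12", "2016-01-20")),
    value   = c(1, 10, 2, 20, 4))

# Split by user, compute the 7-day rolling sum per piece,
# then put the pieces back in the original row order.
pieces = split(DF, DF$user_id)
DF$minus7 = unsplit(lapply(pieces, function(x) ff(x$date, x$value, 7)),
                    DF$user_id)
# DF$minus7 is 1 10 3 20 4
```

    If the rows are not sorted by date within each group, sort them first (for example with order(DF$user_id, DF$date)), since findInterval needs ascending dates.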
