Rolling sums for groups with uneven time gaps

说谎 2021-02-16 00:19

Here's a tweak to my previously posted question. Here's my data:

set.seed(3737)
DF2 = data.frame(user_id = c(rep(27, 7), rep(11, 7)),
            date = as.D         


        
6 answers
  •  面向向阳花
    2021-02-16 00:37

    Here is another option, using dplyr and tbrf:

    library(tbrf)
    library(dplyr)
    set.seed(3737)
    DF2 = data.frame(user_id = c(rep(27, 7), rep(11, 7)),
                     date = as.Date(rep(c('2016-01-01', '2016-01-03', '2016-01-05', '2016-01-07', '2016-01-10', '2016-01-14', '2016-01-16'), 2)),
                     value = round(rnorm(14, 15, 5), 1))
    
    DF2 %>%
      group_by(user_id) %>%
      tbrf::tbr_sum(value, date, unit = "days", n = 7) %>%
      arrange(user_id, date) %>%
      rename(v_minus7 = sum) %>%
      tbrf::tbr_sum(value, date, unit = "days", n = 14) %>%
      rename(v_minus14 = sum)
    

    This creates a tibble:

    # A tibble: 14 x 5
       user_id date       value v_minus7 v_minus14
         <dbl> <date>     <dbl>    <dbl>     <dbl>
     1      11 2016-01-01   6.8      6.8      21.8
     2      27 2016-01-01  15       15        21.8
     3      11 2016-01-03  21.3     28.1      65.5
     4      27 2016-01-03  22.4     37.4      65.5
     5      11 2016-01-05  19.8     47.9      98.6
     6      27 2016-01-05  13.3     50.7      98.6
     7      11 2016-01-07  22       69.9     142. 
     8      27 2016-01-07  21.9     72.6     142. 
     9      11 2016-01-10  19.4     82.5     182. 
    10      27 2016-01-10  20.6     78.2     182. 
    11      11 2016-01-14  17.5     58.9     219. 
    12      27 2016-01-14  18.6     61.1     219. 
    13      11 2016-01-16  19.3     56.2     232. 
    14      27 2016-01-16  16.4     55.6     232. 
    

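    I suspect this isn't the fastest solution for larger datasets, but it fits well in dplyr chains. Note that in the output above v_minus14 is identical for both users on each date (for example, 21.8 = 6.8 + 15 on 2016-01-01), which suggests the second tbr_sum call ran on ungrouped data; adding another group_by(user_id) before it should give per-user 14-day sums.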
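For comparison, here is a minimal sketch of the same windowed sums using only dplyr and base R, which may help when checking results. It assumes the window is inclusive at both ends (a row counts if its date is between `date - n` and `date`), which is what the v_minus7 column above implies. The helper `window_sum` is a hypothetical name, not part of tbrf or dplyr, and the values are typed in from the table above rather than regenerated with `rnorm`.

```r
library(dplyr)

# Values copied from the output table above, so no random seed is needed.
DF2 <- data.frame(
  user_id = c(rep(27, 7), rep(11, 7)),
  date = as.Date(rep(c("2016-01-01", "2016-01-03", "2016-01-05", "2016-01-07",
                       "2016-01-10", "2016-01-14", "2016-01-16"), 2)),
  value = c(15, 22.4, 13.3, 21.9, 20.6, 18.6, 16.4,
            6.8, 21.3, 19.8, 22, 19.4, 17.5, 19.3)
)

# Hypothetical helper: for each row, sum `value` over the rows whose date
# falls in [date - n_days, date], inclusive at both ends (an assumption).
window_sum <- function(date, value, n_days) {
  vapply(seq_along(date), function(i) {
    sum(value[date >= date[i] - n_days & date <= date[i]])
  }, numeric(1))
}

res <- DF2 %>%
  group_by(user_id) %>%                       # windows are computed per user
  mutate(v_minus7  = window_sum(date, value, 7),
         v_minus14 = window_sum(date, value, 14)) %>%
  ungroup() %>%
  arrange(user_id, date)

print(res)
```

Because both columns are created inside one grouped `mutate`, v_minus14 here is a per-user sum and so differs from the v_minus14 column in the table above. The helper is quadratic in the number of rows per group, so it is meant for small data or for cross-checking rather than for speed.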
