Rolling / moving avg by group

醉话见心 2021-01-12 15:51

How can I compute a rolling mean on grouped data? Here's the data:

set.seed(31)
dd<-matrix(sample(seq(1:20),30,replace=TRUE),ncol=3)

Add

2 Answers
  • 2021-01-12 16:36

    The only thing missing is a do.call(rbind, d2_roll_mean), which stacks the per-group results from by() back into a single matrix. To add the original data:

    cbind(d,do.call(rbind,d2_roll_mean))
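
    For context, here is a minimal end-to-end sketch on a small example. The doApply used in this answer is not shown here, so the version below is a hypothetical stand-in, and roll_mean_right, the window width of 3, and the three group IDs are likewise assumptions made only for illustration.

    ```r
    # Trailing (right-aligned) mean of width k; the first k - 1 values are NA.
    # Hypothetical helper, not part of the original post.
    roll_mean_right <- function(x, k = 3) {
      vapply(seq_along(x), function(i) {
        if (i < k) NA_real_ else mean(x[(i - k + 1):i])
      }, numeric(1))
    }

    # Hypothetical stand-in for doApply: apply the rolling mean to every column
    # of one group's rows and always return a matrix, even for a one-row group.
    doApply <- function(x, k = 3) {
      x <- as.matrix(x)
      out <- matrix(NA_real_, nrow(x), ncol(x))
      for (j in seq_len(ncol(x))) out[, j] <- roll_mean_right(x[, j], k)
      out
    }

    set.seed(31)
    dd <- matrix(sample(1:20, 30, replace = TRUE), ncol = 3)
    du <- sample(1:3, 10, replace = TRUE)   # assumed group IDs
    d  <- cbind(du, dd)
    d  <- d[order(d[, 1]), ]                # sort by ID so rows line up after rbind

    d2_roll_mean <- by(d[, -1], d[, 1], doApply)      # one matrix per group
    result <- cbind(d, do.call(rbind, d2_roll_mean))  # original data + rolling means
    ```

    Sorting d by ID first matters: by() returns the groups in ID order, so the stacked result only lines up with d row for row if d is in that same order.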
    

    EDIT: I ran this through system.time() for a bigger example, and it does take its sweet time:

    set.seed(31)
    dd <- matrix(sample(seq(1:20),20000*500,replace=TRUE),ncol=500)
    du <- sample(seq(1:350),20000,replace=TRUE)
    d <- cbind(du,dd)
    d <- d[order(d[,1]),]
    
    system.time(d2_roll_mean <- by(d[,-1], d[,1], doApply))
           User      System      elapsed 
         399.60        0.57       409.91
    

    by() and apply() are not the fastest functions. It may actually be faster to walk through the columns using a for loop and doing this by brute force, relying on the fact that d is sorted by ID.
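
    The brute-force idea could look roughly like the sketch below. It is not the code that was timed above, and no speed-up is claimed for it; roll_mean_sorted and the window width of 3 are assumptions. It loops over the columns once, takes a running sum over the whole column, and only fills rows whose window lies entirely inside their own ID group.

    ```r
    # Hypothetical helper: trailing mean of width k within each ID group.
    # Assumes column 1 of d holds the group ID and d is already sorted by it.
    roll_mean_sorted <- function(d, k = 3) {
      id  <- d[, 1]
      n   <- nrow(d)
      # Position of each row within its group (1, 2, 3, ...); valid because
      # equal IDs are contiguous in a sorted matrix.
      pos <- sequence(rle(id)$lengths)
      idx <- which(pos >= k)              # rows with a full window in their group
      out <- matrix(NA_real_, n, ncol(d) - 1)
      for (j in seq_len(ncol(d) - 1)) {
        cs <- cumsum(c(0, d[, j + 1]))    # cs[m + 1] = sum of the first m values
        out[idx, j] <- (cs[idx + 1] - cs[idx - k + 1]) / k
      }
      out
    }

    set.seed(31)
    dd <- matrix(sample(1:20, 30, replace = TRUE), ncol = 3)
    du <- sample(1:3, 10, replace = TRUE)   # assumed group IDs
    d  <- cbind(du, dd)
    d  <- d[order(d[, 1]), ]
    res <- cbind(d, roll_mean_sorted(d))
    ```

    Rows that do not yet have k observations in their group stay NA, so the result can be bound to d directly without a do.call(rbind, ...) step.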

  • 2021-01-12 16:41

    Using data.table and caTools

    library(data.table)
    library(caTools)
    DT <- data.table(d, key='du')
    DT[, lapply(.SD, function(y) 
           runmean(y, 3, alg='fast',align='right')), by=du]
    

    Update

    If you want to create new columns in the existing dataset

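     nm1 <- paste0('V', 2:4)   # existing value columns
     nm2 <- paste0('V', 5:7)   # names for the new columns; must not collide with nm1
     DT[, (nm1) := lapply(.SD, as.numeric), .SDcols = nm1][,
          (nm2) := lapply(.SD, function(y) runmean(y, 3, alg = 'fast',
                                 align = 'right')), by = du, .SDcols = nm1]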
    