Rolling / moving avg by group


How can I generate a rolling mean with grouped data? Here's the data:

set.seed(31)
dd<-matrix(sample(seq(1:20),30,replace=TRUE),ncol=3)

Add

2 Answers

走了就别回头了

2021-01-12 16:36
The only thing missing is a do.call(rbind, d2_roll_mean) to stack the per-group results into one matrix. To attach them to the original data:
```
cbind(d,do.call(rbind,d2_roll_mean))
```
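For anyone landing here without the question's code: `doApply` and `d2_roll_mean` are defined in the question and are not shown in this answer. Below is a minimal sketch of how the pieces might fit together, assuming a trailing 3-row mean and a group ID column `du`. The `roll_mean` helper, the body of `doApply`, and the way `du` is generated are my assumptions, not the original code.

```r
# Hypothetical stand-ins for the question's objects (not the original code).
set.seed(31)
dd <- matrix(sample(1:20, 30, replace = TRUE), ncol = 3)
du <- sample(1:4, 10, replace = TRUE)   # assumed group IDs
d  <- cbind(du, dd)
d  <- d[order(d[, 1]), ]                # sort by ID so the by() output lines up with d

# Trailing mean over k rows; the first k - 1 positions are NA.
roll_mean <- function(x, k = 3) {
  n <- length(x)
  out <- rep(NA_real_, n)
  if (n >= k) {
    cs <- cumsum(c(0, x))
    out[k:n] <- (cs[(k + 1):(n + 1)] - cs[1:(n - k + 1)]) / k
  }
  out
}

# Assumed shape of doApply: rolling mean of every column within one group.
doApply <- function(g) {
  g <- as.matrix(g)
  # matrix() keeps the result 2-dimensional even for single-row groups
  matrix(apply(g, 2, roll_mean), nrow = nrow(g))
}

d2_roll_mean <- by(d[, -1], d[, 1], doApply)
res <- cbind(d, do.call(rbind, d2_roll_mean))
```

Sorting `d` by ID first matters here: `by()` returns the groups in ID order, so the stacked result only lines up row by row with `d` if `d` is in the same order.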
EDIT: I ran this through system.time() for a bigger example, and it does take its sweet time:
```
set.seed(31)
dd <- matrix(sample(seq(1:20),20000*500,replace=TRUE),ncol=500)
du <- sample(seq(1:350),20000,replace=TRUE)
d <- cbind(du,dd)
d <- d[order(d[,1]),]

system.time(d2_roll_mean <- by(d[,-1], d[,1], doApply))
       User      System      elapsed 
     399.60        0.57       409.91
```
by() and apply() are not the fastest functions. It may actually be faster to walk through the columns with a for loop and do this by brute force, relying on the fact that d is sorted by ID.
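A rough sketch of that brute-force idea, assuming a trailing window of 3 rows and NA wherever the window would reach back into the previous group. The function name `roll_mean_sorted` and its arguments are made up for illustration, and I have not timed it against the by() version.

```r
# Rolling mean over k rows for every column of a matrix whose rows are
# sorted by ID. Windows that would cross a group boundary are set to NA.
roll_mean_sorted <- function(id, m, k = 3) {
  n <- nrow(m)
  out <- matrix(NA_real_, n, ncol(m))
  if (n < k) return(out)
  rows <- k:n
  # Because id is sorted, the window ending at row i stays inside one
  # group exactly when rows i and i - k + 1 share the same ID.
  ok <- id[rows] == id[rows - k + 1]
  for (j in seq_len(ncol(m))) {
    cs <- cumsum(c(0, m[, j]))                  # running totals of column j
    w  <- (cs[rows + 1] - cs[rows - k + 1]) / k # window sums divided by k
    w[!ok] <- NA
    out[rows, j] <- w
  }
  out
}

# Small example: two groups, two columns.
id  <- c(1, 1, 1, 1, 2, 2, 2)
m   <- cbind(a = 1:7, b = (1:7) * 10)
res <- roll_mean_sorted(id, m)
```

The loop runs once per column rather than once per group and column, and each pass is vectorised, which is where any speed-up would come from.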

误落风尘

2021-01-12 16:41

Using data.table and caTools

library(data.table)
library(caTools)
DT <- data.table(d, key='du')
DT[, lapply(.SD, function(y) 
       runmean(y, 3, alg='fast',align='right')), by=du]

Update

If you want to create new columns in the existing dataset:

 nm1 <- paste0('V', 2:4)   # existing value columns
 nm2 <- paste0("V", 5:7)   # new columns; 4:6 would overwrite V4
 DT[, (nm1):=lapply(.SD, as.numeric), .SDcols=nm1][,
       (nm2):=lapply(.SD, function(y) runmean(y, 3, alg='fast',
                              align='right')), by=du, .SDcols=nm1]
