How can I generate a rolling mean for grouped data? Here's the data:
set.seed(31)
dd<-matrix(sample(seq(1:20),30,replace=TRUE),ncol=3)
The only thing missing is a do.call(rbind, d2_roll_mean). To add the original data:
cbind(d,do.call(rbind,d2_roll_mean))
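The definition of doApply is not shown above, so here is a minimal sketch of how the by() / do.call(rbind, ...) / cbind() steps might fit together. The names roll_mean_right and doApply_sketch, the toy data, the window of 3, and the choice to return NA for incomplete windows are all assumptions for illustration, not the original function.

```r
# Hypothetical right-aligned rolling mean over a window of k values.
# Positions with fewer than k preceding values are left as NA (an assumption).
roll_mean_right <- function(x, k = 3) {
  n <- length(x)
  out <- rep(NA_real_, n)
  if (n >= k) {
    cs <- cumsum(c(0, x))
    out[k:n] <- (cs[(k + 1):(n + 1)] - cs[1:(n - k + 1)]) / k
  }
  out
}

# Hypothetical stand-in for doApply: rolling mean of every column in one group.
doApply_sketch <- function(block, k = 3) {
  res <- apply(block, 2, roll_mean_right, k = k)
  # apply() drops to a vector for one-row groups, so restore the matrix shape
  matrix(res, nrow = nrow(block))
}

# Toy data: first column is the group ID, the rest are values.
d <- cbind(du = c(1, 1, 1, 1, 2, 2, 2),
           v1 = c(1, 2, 3, 4, 10, 20, 30),
           v2 = c(2, 4, 6, 8, 1, 1, 1))
d <- d[order(d[, 1]), ]   # by() returns groups in ID order, so sort first

d2_roll_mean <- by(d[, -1], d[, 1], doApply_sketch)
result <- cbind(d, do.call(rbind, d2_roll_mean))
```

Sorting d by ID before calling by() matters here: the stacked group results come back in ID order, so the cbind() only lines up with the original rows if d is in that order too.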
EDIT: I ran this through system.time() for a bigger example, and it does take its sweet time:
set.seed(31)
dd <- matrix(sample(seq(1:20),20000*500,replace=TRUE),ncol=500)
du <- sample(seq(1:350),20000,replace=TRUE)
d <- cbind(du,dd)
d <- d[order(d[,1]),]
system.time(d2_roll_mean <- by(d[,-1], d[,1], doApply))
   user  system elapsed
 399.60    0.57  409.91
by() and apply() are not the fastest functions. It may actually be faster to walk through the columns with a for loop and do this by brute force, relying on the fact that d is sorted by ID.
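The brute-force idea could look roughly like the sketch below. It is only an illustration: the function name roll_mean_sorted, the window of 3, the cumulative-sum trick, and the use of NA for windows that would cross a group boundary are my assumptions, and I have not timed it against the by() version.

```r
# Hypothetical brute-force rolling mean for a matrix whose first column is a
# group ID and whose rows are already sorted by that ID.
roll_mean_sorted <- function(d, k = 3) {
  id <- d[, 1]
  n <- nrow(d)
  out <- matrix(NA_real_, nrow = n, ncol = ncol(d) - 1)
  if (n < k) return(out)

  idx <- k:n
  # A window ending at row i starts at row i - k + 1. Because d is sorted by
  # ID, the window lies inside one group exactly when both ends share an ID.
  valid <- id[idx] == id[idx - k + 1]

  for (j in seq_len(ncol(d) - 1)) {
    cs <- cumsum(c(0, d[, j + 1]))              # running total of column j
    m <- (cs[idx + 1] - cs[idx - k + 1]) / k    # mean of each k-row window
    m[!valid] <- NA                             # drop windows spanning two groups
    out[idx, j] <- m
  }
  out
}

# Toy data, sorted by the ID column.
d <- cbind(du = c(1, 1, 1, 1, 2, 2, 2),
           v1 = c(1, 2, 3, 4, 10, 20, 30),
           v2 = c(2, 4, 6, 8, 1, 1, 1))
rolled <- roll_mean_sorted(d, k = 3)
```

The loop touches each column once and never splits the data by group, which is where the saving would come from if there is one. Note that a running total over a very long column of large values can lose some floating-point precision.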
Using data.table and caTools:
library(data.table)
library(caTools)
DT <- data.table(d, key='du')
# Right-aligned rolling mean (window of 3) of every column, computed per group
DT[, lapply(.SD, function(y) runmean(y, 3, alg='fast', align='right')), by=du]
If you want to create new columns in the existing dataset:
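nm1 <- paste0('V', 2:4)   # existing value columns
nm2 <- paste0('V', 5:7)   # new columns for the rolling means (5:7 so that V4 is not overwritten)
# Convert the source columns to numeric, then add the rolling means by group.
# .SDcols=nm1 keeps the number of source columns equal to the number of new names.
DT[, (nm1):=lapply(.SD, as.numeric), .SDcols=nm1][,
   (nm2):=lapply(.SD, function(y) runmean(y, 3, alg='fast', align='right')),
   by=du, .SDcols=nm1]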
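If you would rather not depend on caTools, recent versions of data.table ship their own frollmean(). The sketch below uses it on toy data. The column names, the "_roll" suffix, and the data are made up for illustration, and edge handling may differ from runmean's, so compare the first rows of each group before swapping one for the other.

```r
library(data.table)

# Hypothetical toy data: 'du' is the group ID, V2 and V3 are the values.
DT <- data.table(du = c(1, 1, 1, 1, 2, 2, 2),
                 V2 = c(1, 2, 3, 4, 10, 20, 30),
                 V3 = c(2, 4, 6, 8, 1, 1, 1))

src <- c("V2", "V3")            # columns to average
new <- paste0(src, "_roll")     # hypothetical names for the new columns

# Right-aligned rolling mean over 3 rows within each group, added by reference.
# Rows without a full window are filled with NA (frollmean's default fill).
DT[, (new) := frollmean(.SD, n = 3, align = "right"), by = du, .SDcols = src]
```

Here frollmean() receives the whole .SD and returns one vector per column, so no lapply() wrapper is needed.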