I\'m trying to use R to calculate the moving average over a series of values in a matrix. The normal R mailing list search hasn\'t been very helpful though. There doesn\'t s
The slider package can be used for this. It has an interface that has been specifically designed to feel similar to purrr. It accepts any arbitrary function, and can return any type of output. Data frames are even iterated over row wise. The pkgdown site is here.
library(slider)
x <- 1:3
# Mean of the current value + 1 value before it
# returned as a double vector
slide_dbl(x, ~mean(.x, na.rm = TRUE), .before = 1)
#> [1] 1.0 1.5 2.5
df <- data.frame(x = x, y = x)
# Slide row wise over data frames
slide(df, ~.x, .before = 1)
#> [[1]]
#> x y
#> 1 1 1
#>
#> [[2]]
#> x y
#> 1 1 1
#> 2 2 2
#>
#> [[3]]
#> x y
#> 1 2 2
#> 2 3 3
The overhead of both slider and data.table's frollapply()
should be pretty low (much faster than zoo). frollapply()
looks to be a little faster for this simple example here, but note that it only takes numeric input, and the output must be a scalar numeric value. slider functions are completely generic, and you can return any data type.
library(slider)
library(zoo)
library(data.table)
x <- 1:50000 + 0L
bench::mark(
slider = slide_int(x, function(x) 1L, .before = 5, .complete = TRUE),
zoo = rollapplyr(x, FUN = function(x) 1L, width = 6, fill = NA),
datatable = frollapply(x, n = 6, FUN = function(x) 1L),
iterations = 200
)
#> # A tibble: 3 x 6
#> expression min median `itr/sec` mem_alloc `gc/sec`
#>
#> 1 slider 19.82ms 26.4ms 38.4 829.8KB 19.0
#> 2 zoo 177.92ms 211.1ms 4.71 17.9MB 24.8
#> 3 datatable 7.78ms 10.9ms 87.9 807.1KB 38.7