Is there a SciPy function or NumPy function or module for Python that calculates the running mean of a 1D array given a specific window?
I know this is an old question, but here is a solution that doesn't use any extra data structures or libraries. It is linear in the number of elements of the input list and I cannot think of any other way to make it more efficient (actually if anyone knows of a better way to allocate the result, please let me know).
NOTE: this would be much faster using a NumPy array instead of a list, but I wanted to eliminate all dependencies. It would also be possible to improve performance with multi-threaded execution.
The function assumes that the input list is one dimensional, so be careful.
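    ### Running mean/Moving average
    def running_mean(l, N):
        sum = 0.0  # window accumulator; float so division is not truncated on Python 2
        result = [0.0] * len(l)

        # Warm-up: fewer than N elements seen so far, so average what we have.
        for i in range(0, min(N, len(l))):
            sum = sum + l[i]
            result[i] = sum / (i + 1)

        # Steady state: drop the element leaving the window, add the new one.
        for i in range(N, len(l)):
            sum = sum - l[i - N] + l[i]
            result[i] = sum / N

        return result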
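As a sanity check, the output of a function like the one above can be compared against a naive reference that recomputes every window from scratch. The sketch below is not part of the original answer: `naive_running_mean` is a hypothetical name, and it assumes the same convention as the code above (the first N-1 outputs average only the elements seen so far). It costs O(len(values) * window), so it is only meant for testing.

```python
def naive_running_mean(values, window):
    """Reference implementation: recompute each window mean from scratch."""
    result = []
    for i in range(len(values)):
        # Window covers indices max(0, i-window+1) .. i, truncated at the start.
        start = max(0, i - window + 1)
        chunk = values[start:i + 1]
        result.append(sum(chunk) / float(len(chunk)))
    return result


print(naive_running_mean([1, 2, 3, 4, 5, 6], 3))
# [1.0, 1.5, 2.0, 3.0, 4.0, 5.0]
```

Any linear-time version that uses the same boundary convention should agree with this one on every input.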
Example
Assume that we have a list data = [ 1, 2, 3, 4, 5, 6 ]
on which we want to compute a rolling mean with a period of 3, and that we also want an output list the same size as the input (which is most often the case).
The first element has index 0, so the rolling mean should be computed on the elements at indices -2, -1 and 0. We don't have data[-2] and data[-1] (unless you want to use special boundary conditions), so one option is to assume that those elements are 0. This is equivalent to zero-padding the list, except that we don't actually pad it; we just keep track of the indices handled by the first loop (from 0 to N-1).
So, for the first N elements we just keep adding up the elements in an accumulator. With zero padding, dividing by N = 3 gives:
result[0] = (0 + 0 + 1) / 3 = 0.333 == (sum + 1) / 3
result[1] = (0 + 1 + 2) / 3 = 1 == (sum + 2) / 3
result[2] = (1 + 2 + 3) / 3 = 2 == (sum + 3) / 3
Note that the function above divides by (i+1) rather than N in this first loop, so as written it returns the mean of the elements seen so far (1, 1.5 and 2). Replace (i+1) with N to get the zero-padded values.
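The zero-padded values worked out above (0.333, 1, 2) correspond to dividing by N during the warm-up as well. Here is a minimal sketch of that variant, assuming zero padding is the boundary condition you want; `running_mean_zero_padded` is a hypothetical name, not from the original answer.

```python
def running_mean_zero_padded(values, window):
    """Running mean that treats elements before index 0 as zeros."""
    total = 0.0
    result = [0.0] * len(values)
    for i, value in enumerate(values):
        total += value
        if i >= window:
            # Drop the element that just left the window.
            total -= values[i - window]
        # Always divide by the full window size, so the missing
        # leading elements count as zeros.
        result[i] = total / window
    return result


means = running_mean_zero_padded([1, 2, 3, 4, 5, 6], 3)
print([round(m, 3) for m in means])
# [0.333, 1.0, 2.0, 3.0, 4.0, 5.0]
```

Whether to pad with zeros or to average only the available elements is a design choice; the two differ only in the first N-1 outputs.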
From index N onwards, simple accumulation no longer works. We expect result[3] = (2 + 3 + 4) / 3 = 3, but this differs from (sum + 4) / 3 = 3.333. The way to compute the correct value is to subtract data[0] = 1 from sum + 4, giving sum + 4 - 1 = 9, and 9 / 3 = 3. This works because at that point sum = data[0] + data[1] + data[2], and the same holds for every i >= N: before the subtraction, sum is data[i-N] + ... + data[i-2] + data[i-1].
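The invariant described above can be checked directly. The sketch below is illustrative only (the variable names `window_sum` and `window_sums` are hypothetical): it slides the window sum across the example list and asserts at every step that the incrementally updated sum equals the sum recomputed from the slice.

```python
data = [1, 2, 3, 4, 5, 6]
N = 3

# After the warm-up, the accumulator holds data[0] + data[1] + data[2].
window_sum = sum(data[:N])
window_sums = [window_sum]

for i in range(N, len(data)):
    # Before the update, window_sum == data[i-N] + ... + data[i-1].
    assert window_sum == sum(data[i - N:i])
    window_sum = window_sum - data[i - N] + data[i]
    # After the update, it covers data[i-N+1] .. data[i].
    assert window_sum == sum(data[i - N + 1:i + 1])
    window_sums.append(window_sum)

print(window_sums)  # [6, 9, 12, 15]
```

Dividing each of these sums by N gives the means 2, 3, 4 and 5 for indices 2 through 5.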