I need a rolling window (aka sliding window) iterable over a sequence/iterator/generator. Default Python iteration can be considered a special case, where the window length
Just to show how you can combine itertools recipes, I'm extending the pairwise
recipe as directly as possible back into the window
recipe using the consume
recipe:
def consume(iterator, n):
"Advance the iterator n-steps ahead. If n is none, consume entirely."
# Use functions that consume iterators at C speed.
if n is None:
# feed the entire iterator into a zero-length deque
collections.deque(iterator, maxlen=0)
else:
# advance to the empty slice starting at position n
next(islice(iterator, n, n), None)
def window(iterable, n=2):
"s -> (s0, ...,s(n-1)), (s1, ...,sn), (s2, ..., s(n+1)), ..."
iters = tee(iterable, n)
# Could use enumerate(islice(iters, 1, None), 1) to avoid consume(it, 0), but that's
# slower for larger window sizes, while saving only small fixed "noop" cost
for i, it in enumerate(iters):
consume(it, i)
return zip(*iters)
The window
recipe is the same as for pairwise
, it just replaces the single element "consume" on the second tee
-ed iterator with progressively increasing consumes on n - 1
iterators. Using consume
instead of wrapping each iterator in islice
is marginally faster (for sufficiently large iterables) since you only pay the islice
wrapping overhead during the consume
phase, not during the process of extracting each window-ed value (so it's bounded by n
, not the number of items in iterable
).
Performance-wise, compared to some other solutions, this is pretty good (and better than any of the other solutions I tested as it scales). Tested on Python 3.5.0, Linux x86-64, using ipython
%timeit
magic.
kindall's the deque solution, tweaked for performance/correctness by using islice
instead of a home-rolled generator expression and testing the resulting length so it doesn't yield results when the iterable is shorter than the window, as well as passing the maxlen
of the deque
positionally instead of by keyword (makes a surprising difference for smaller inputs):
>>> %timeit -r5 deque(windowkindall(range(10), 3), 0)
100000 loops, best of 5: 1.87 μs per loop
>>> %timeit -r5 deque(windowkindall(range(1000), 3), 0)
10000 loops, best of 5: 72.6 μs per loop
>>> %timeit -r5 deque(windowkindall(range(1000), 30), 0)
1000 loops, best of 5: 71.6 μs per loop
Same as previous adapted kindall solution, but with each yield win
changed to yield tuple(win)
so storing results from the generator works without all stored results really being a view of the most recent result (all other reasonable solutions are safe in this scenario), and adding tuple=tuple
to the function definition to move use of tuple
from the B
in LEGB
to the L
:
>>> %timeit -r5 deque(windowkindalltupled(range(10), 3), 0)
100000 loops, best of 5: 3.05 μs per loop
>>> %timeit -r5 deque(windowkindalltupled(range(1000), 3), 0)
10000 loops, best of 5: 207 μs per loop
>>> %timeit -r5 deque(windowkindalltupled(range(1000), 30), 0)
1000 loops, best of 5: 348 μs per loop
consume
-based solution shown above:
>>> %timeit -r5 deque(windowconsume(range(10), 3), 0)
100000 loops, best of 5: 3.92 μs per loop
>>> %timeit -r5 deque(windowconsume(range(1000), 3), 0)
10000 loops, best of 5: 42.8 μs per loop
>>> %timeit -r5 deque(windowconsume(range(1000), 30), 0)
1000 loops, best of 5: 232 μs per loop
Same as consume
, but inlining else
case of consume
to avoid function call and n is None
test to reduce runtime, particularly for small inputs where the setup overhead is a meaningful part of the work:
>>> %timeit -r5 deque(windowinlineconsume(range(10), 3), 0)
100000 loops, best of 5: 3.57 μs per loop
>>> %timeit -r5 deque(windowinlineconsume(range(1000), 3), 0)
10000 loops, best of 5: 40.9 μs per loop
>>> %timeit -r5 deque(windowinlineconsume(range(1000), 30), 0)
1000 loops, best of 5: 211 μs per loop
(Side-note: A variant on pairwise
that uses tee
with the default argument of 2 repeatedly to make nested tee
objects, so any given iterator is only advanced once, not independently consumed an increasing number of times, similar to MrDrFenner's answer is similar to non-inlined consume
and slower than the inlined consume
on all tests, so I've omitted it those results for brevity).
As you can see, if you don't care about the possibility of the caller needing to store results, my optimized version of kindall's solution wins most of the time, except in the "large iterable, small window size case" (where inlined consume
wins); it degrades quickly as the iterable size increases, while not degrading at all as the window size increases (every other solution degrades more slowly for iterable size increases, but also degrades for window size increases). It can even be adapted for the "need tuples" case by wrapping in map(tuple, ...)
, which runs ever so slightly slower than putting the tupling in the function, but it's trivial (takes 1-5% longer) and lets you keep the flexibility of running faster when you can tolerate repeatedly returning the same value.
If you need safety against returns being stored, inlined consume
wins on all but the smallest input sizes (with non-inlined consume
being slightly slower but scaling similarly). The deque
& tupling based solution wins only for the smallest inputs, due to smaller setup costs, and the gain is small; it degrades badly as the iterable gets longer.
For the record, the adapted version of kindall's solution that yield
s tuple
s I used was:
def windowkindalltupled(iterable, n=2, tuple=tuple):
it = iter(iterable)
win = deque(islice(it, n), n)
if len(win) < n:
return
append = win.append
yield tuple(win)
for e in it:
append(e)
yield tuple(win)
Drop the caching of tuple
in the function definition line and the use of tuple
in each yield
to get the faster but less safe version.