I need a rolling window (aka sliding window) iterable over a sequence/iterator/generator. Default Python iteration can be considered a special case, where the window length
There's one in an old version of the Python docs with itertools examples:
from itertools import islice
def window(seq, n=2):
"Returns a sliding window (of width n) over data from the iterable"
" s -> (s0,s1,...s[n-1]), (s1,s2,...,sn), ... "
it = iter(seq)
result = tuple(islice(it, n))
if len(result) == n:
yield result
for elem in it:
result = result[1:] + (elem,)
yield result
The one from the docs is a little more succinct and uses itertools
to greater effect I imagine.
Here's a generalization that adds support for step
, fillvalue
parameters:
from collections import deque
from itertools import islice
def sliding_window(iterable, size=2, step=1, fillvalue=None):
if size < 0 or step < 1:
raise ValueError
it = iter(iterable)
q = deque(islice(it, size), maxlen=size)
if not q:
return # empty iterable or size == 0
q.extend(fillvalue for _ in range(size - len(q))) # pad to size
while True:
yield iter(q) # iter() to avoid accidental outside modifications
try:
q.append(next(it))
except StopIteration: # Python 3.5 pep 479 support
return
q.extend(next(it, fillvalue) for _ in range(step - 1))
It yields in chunks size
items at a time rolling step
positions per iteration padding each chunk with fillvalue
if necessary. Example for size=4, step=3, fillvalue='*'
:
[a b c d]e f g h i j k l m n o p q r s t u v w x y z
a b c[d e f g]h i j k l m n o p q r s t u v w x y z
a b c d e f[g h i j]k l m n o p q r s t u v w x y z
a b c d e f g h i[j k l m]n o p q r s t u v w x y z
a b c d e f g h i j k l[m n o p]q r s t u v w x y z
a b c d e f g h i j k l m n o[p q r s]t u v w x y z
a b c d e f g h i j k l m n o p q r[s t u v]w x y z
a b c d e f g h i j k l m n o p q r s t u[v w x y]z
a b c d e f g h i j k l m n o p q r s t u v w x[y z * *]
For an example of use case for the step
parameter, see Processing a large .txt file in python efficiently.
Just a quick contribution.
Since the current python docs don't have "window" in the itertool examples (i.e., at the bottom of http://docs.python.org/library/itertools.html), here's an snippet based on the code for grouper which is one of the examples given:
import itertools as it
def window(iterable, size):
shiftedStarts = [it.islice(iterable, s, None) for s in xrange(size)]
return it.izip(*shiftedStarts)
Basically, we create a series of sliced iterators, each with a starting point one spot further forward. Then, we zip these together. Note, this function returns a generator (it is not directly a generator itself).
Much like the appending-element and advancing-iterator versions above, the performance (i.e., which is best) varies with list size and window size. I like this one because it is a two-liner (it could be a one-liner, but I prefer naming concepts).
It turns out that the above code is wrong. It works if the parameter passed to iterable is a sequence but not if it is an iterator. If it is an iterator, the same iterator is shared (but not tee'd) among the islice calls and this breaks things badly.
Here is some fixed code:
import itertools as it
def window(iterable, size):
itrs = it.tee(iterable, size)
shiftedStarts = [it.islice(anItr, s, None) for s, anItr in enumerate(itrs)]
return it.izip(*shiftedStarts)
Also, one more version for the books. Instead of copying an iterator and then advancing copies many times, this version makes pairwise copies of each iterator as we move the starting position forward. Thus, iterator t provides both the "complete" iterator with starting point at t and also the basis for creating iterator t + 1:
import itertools as it
def window4(iterable, size):
complete_itr, incomplete_itr = it.tee(iterable, 2)
iters = [complete_itr]
for i in xrange(1, size):
incomplete_itr.next()
complete_itr, incomplete_itr = it.tee(incomplete_itr, 2)
iters.append(complete_itr)
return it.izip(*iters)
here is a one liner. I timed it and it's comprable to the performance of the top answer and gets progressively better with larger seq from 20% slower with len(seq) = 20 and 7% slower with len(seq) = 10000
zip(*[seq[i:(len(seq) - n - 1 + i)] for i in range(n)])
I tested a few solutions and one I came up with and found the one I came up with to be the fastest so I thought I would share it.
import itertools
import sys
def windowed(l, stride):
return zip(*[itertools.islice(l, i, sys.maxsize) for i in range(stride)])
This is an old question but for those still interested there is a great implementation of a window slider using generators in this page (by Adrian Rosebrock).
It is an implementation for OpenCV however you can easily use it for any other purpose. For the eager ones i'll paste the code here but to understand it better I recommend visiting the original page.
def sliding_window(image, stepSize, windowSize):
# slide a window across the image
for y in xrange(0, image.shape[0], stepSize):
for x in xrange(0, image.shape[1], stepSize):
# yield the current window
yield (x, y, image[y:y + windowSize[1], x:x + windowSize[0]])
Tip: You can check the .shape
of the window when iterating the generator to discard those that do not meet your requirements
Cheers