Rolling or sliding window iterator?

南方客 2020-11-21 05:23

I need a rolling window (aka sliding window) iterable over a sequence/iterator/generator. Default Python iteration can be considered a special case, where the window length is 1.

23 answers
  • 2020-11-21 06:02

    There's one in an old version of the Python docs with itertools examples:

    from itertools import islice
    
    def window(seq, n=2):
        """Return a sliding window (of width n) over data from the iterable.

        s -> (s0, s1, ..., s[n-1]), (s1, s2, ..., sn), ...
        """
        it = iter(seq)
        result = tuple(islice(it, n))
        if len(result) == n:
            yield result
        for elem in it:
            result = result[1:] + (elem,)
            yield result
    

    The one from the docs is a little more succinct and uses itertools to greater effect I imagine.
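
    A quick check of the recipe above, as a minimal sketch. The sample inputs are made up for illustration; the function body is the one quoted in this answer, with the two docstring lines merged into one docstring.

```python
from itertools import islice

def window(seq, n=2):
    """Return a sliding window (of width n) over data from the iterable.

    s -> (s0, s1, ..., s[n-1]), (s1, s2, ..., sn), ...
    """
    it = iter(seq)
    result = tuple(islice(it, n))
    if len(result) == n:
        yield result
    for elem in it:
        # Drop the oldest element and append the newest one.
        result = result[1:] + (elem,)
        yield result

# Works on any iterable, including one-shot generators.
pairs = list(window(range(5)))
triples = list(window((c for c in "abcde"), 3))
# An input shorter than the window yields nothing.
too_short = list(window([1], 2))

print(pairs)      # [(0, 1), (1, 2), (2, 3), (3, 4)]
print(triples)    # [('a', 'b', 'c'), ('b', 'c', 'd'), ('c', 'd', 'e')]
print(too_short)  # []
```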

  • 2020-11-21 06:02

    Here's a generalization that adds support for step and fillvalue parameters:

    from collections import deque
    from itertools import islice
    
    def sliding_window(iterable, size=2, step=1, fillvalue=None):
        if size < 0 or step < 1:
            raise ValueError
        it = iter(iterable)
        q = deque(islice(it, size), maxlen=size)
        if not q:
            return  # empty iterable or size == 0
        q.extend(fillvalue for _ in range(size - len(q)))  # pad to size
        while True:
            yield iter(q)  # iter() to avoid accidental outside modifications
            try:
                q.append(next(it))
            except StopIteration: # Python 3.5 pep 479 support
                return
            q.extend(next(it, fillvalue) for _ in range(step - 1))
    

    It yields chunks of size items at a time, advancing step positions per iteration and padding each chunk with fillvalue if necessary. Example for size=4, step=3, fillvalue='*':

     [a b c d]e f g h i j k l m n o p q r s t u v w x y z
      a b c[d e f g]h i j k l m n o p q r s t u v w x y z
      a b c d e f[g h i j]k l m n o p q r s t u v w x y z
      a b c d e f g h i[j k l m]n o p q r s t u v w x y z
      a b c d e f g h i j k l[m n o p]q r s t u v w x y z
      a b c d e f g h i j k l m n o[p q r s]t u v w x y z
      a b c d e f g h i j k l m n o p q r[s t u v]w x y z
      a b c d e f g h i j k l m n o p q r s t u[v w x y]z
      a b c d e f g h i j k l m n o p q r s t u v w x[y z * *]
    

    For an example use case for the step parameter, see "Processing a large .txt file in python efficiently".
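
    The diagram above can be reproduced with a short driver, sketched here under the assumption that the input is the lowercase alphabet (taken from string.ascii_lowercase, which is not part of the original answer). The function is repeated unchanged so the snippet runs on its own.

```python
from collections import deque
from itertools import islice
from string import ascii_lowercase

def sliding_window(iterable, size=2, step=1, fillvalue=None):
    if size < 0 or step < 1:
        raise ValueError
    it = iter(iterable)
    q = deque(islice(it, size), maxlen=size)
    if not q:
        return  # empty iterable or size == 0
    q.extend(fillvalue for _ in range(size - len(q)))  # pad to size
    while True:
        yield iter(q)  # iter() to avoid accidental outside modifications
        try:
            q.append(next(it))
        except StopIteration:
            return
        q.extend(next(it, fillvalue) for _ in range(step - 1))

# Each window is joined into a string before the generator advances again.
chunks = ["".join(w) for w in sliding_window(ascii_lowercase, size=4, step=3, fillvalue="*")]
print(chunks)
# ['abcd', 'defg', 'ghij', 'jklm', 'mnop', 'pqrs', 'stuv', 'vwxy', 'yz**']
```

    Because each yielded iterator reads the live deque, it is safest to consume a window before asking for the next one, as the list comprehension does here.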

  • 2020-11-21 06:03

    Just a quick contribution.

    Since the current Python docs don't have "window" in the itertools examples (i.e., at the bottom of http://docs.python.org/library/itertools.html), here's a snippet based on the code for grouper, which is one of the examples given:

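    import itertools as it
    def window(iterable, size):
        # Python 3: range and zip replace xrange and itertools.izip.
        # Caution: only correct when iterable can be iterated more than once.
        shiftedStarts = [it.islice(iterable, s, None) for s in range(size)]
        return zip(*shiftedStarts)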
    

    Basically, we create a series of sliced iterators, each with a starting point one spot further forward. Then, we zip these together. Note that this function returns an iterator (it is not itself a generator function).

    Much like the appending-element and advancing-iterator versions above, the performance (i.e., which is best) varies with list size and window size. I like this one because it is a two-liner (it could be a one-liner, but I prefer naming concepts).

    It turns out that the above code is wrong. It works if the parameter passed to iterable is a sequence but not if it is an iterator. If it is an iterator, the same iterator is shared (but not tee'd) among the islice calls and this breaks things badly.

    Here is some fixed code:

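    import itertools as it
    def window(iterable, size):
        # tee() gives every slice its own independent copy of the input.
        itrs = it.tee(iterable, size)
        shiftedStarts = [it.islice(anItr, s, None) for s, anItr in enumerate(itrs)]
        # Python 3: zip replaces itertools.izip.
        return zip(*shiftedStarts)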
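
    To make the difference concrete, here is a minimal sketch that runs both versions on the same data. It assumes Python 3 (zip and range in place of izip and xrange); the names window_shared and window_teed are made up for this comparison.

```python
import itertools as it

def window_shared(iterable, size):
    # First attempt: every islice pulls from the same underlying iterable.
    starts = [it.islice(iterable, s, None) for s in range(size)]
    return zip(*starts)

def window_teed(iterable, size):
    # Fixed version: tee() gives each islice its own independent copy.
    copies = it.tee(iterable, size)
    starts = [it.islice(c, s, None) for s, c in enumerate(copies)]
    return zip(*starts)

data = [1, 2, 3, 4]
from_list = list(window_shared(data, 2))        # a list can be re-iterated
from_iter = list(window_shared(iter(data), 2))  # a one-shot iterator cannot
fixed = list(window_teed(iter(data), 2))

print(from_list)  # [(1, 2), (2, 3), (3, 4)]
print(from_iter)  # [(1, 3)]
print(fixed)      # [(1, 2), (2, 3), (3, 4)]
```

    With the shared iterator, each islice steals elements the others needed, so windows are skipped and the output is cut short.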
    

    Also, one more version for the books. Instead of copying an iterator and then advancing copies many times, this version makes pairwise copies of each iterator as we move the starting position forward. Thus, iterator t provides both the "complete" iterator with starting point at t and also the basis for creating iterator t + 1:

    import itertools as it
    def window4(iterable, size):
        complete_itr, incomplete_itr = it.tee(iterable, 2)
        iters = [complete_itr]
        for i in range(1, size):
            # Advance one step; the default avoids StopIteration on short input.
            next(incomplete_itr, None)
            complete_itr, incomplete_itr = it.tee(incomplete_itr, 2)
            iters.append(complete_itr)
        # Python 3: zip replaces itertools.izip.
        return zip(*iters)
    
  • 2020-11-21 06:04

    Here is a one-liner. I timed it, and its performance is comparable to the top answer's and improves as seq grows: from 20% slower with len(seq) = 20 to 7% slower with len(seq) = 10000.

    # seq must support len() and slicing; n is the window width
    zip(*[seq[i:(len(seq) - n + 1 + i)] for i in range(n)])
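
    Since zip stops at the shortest input, the end index of each slice is not strictly needed. A hedged sketch for sequences that support slicing (lists, tuples, strings); the values of seq and n are sample data chosen for illustration:

```python
seq = [0, 1, 2, 3, 4]
n = 3

# Slice i starts at offset i; zip truncates to the shortest slice,
# which leaves exactly len(seq) - n + 1 windows.
windows = list(zip(*[seq[i:] for i in range(n)]))
print(windows)  # [(0, 1, 2), (1, 2, 3), (2, 3, 4)]
```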
    
  • 2020-11-21 06:05

    I tested a few solutions, including one I came up with, and found mine to be the fastest, so I thought I would share it.

    import itertools
    import sys
    
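    def windowed(l, stride):
        # stride is the window width. l must be a re-iterable sequence
        # (list, tuple, str): a one-shot iterator would be shared by every
        # islice call and produce wrong windows.
        return zip(*[itertools.islice(l, i, sys.maxsize) for i in range(stride)])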
    
  • 2020-11-21 06:06

    This is an old question, but for those still interested, there is a great implementation of a sliding window using generators on this page (by Adrian Rosebrock).

    It is an implementation for OpenCV; however, you can easily use it for any other purpose. For the eager ones, I'll paste the code here, but to understand it better I recommend visiting the original page.

    def sliding_window(image, stepSize, windowSize):
        # slide a window across the image
        for y in range(0, image.shape[0], stepSize):
            for x in range(0, image.shape[1], stepSize):
                # yield the current window
                yield (x, y, image[y:y + windowSize[1], x:x + windowSize[0]])
    

    Tip: You can check the .shape of each window when iterating the generator to discard windows that do not meet your requirements.
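
    A minimal sketch of that tip, assuming the image is a NumPy array (the zero-filled array and the 3x3 window are made-up sample values) and Python 3 (range in place of xrange):

```python
import numpy as np

def sliding_window(image, stepSize, windowSize):
    # windowSize is (width, height); yields (x, y, window) for each position
    for y in range(0, image.shape[0], stepSize):
        for x in range(0, image.shape[1], stepSize):
            yield (x, y, image[y:y + windowSize[1], x:x + windowSize[0]])

image = np.zeros((5, 7))  # stand-in "image": 5 rows, 7 columns
win_w, win_h = 3, 3

full_positions = []
for x, y, window in sliding_window(image, stepSize=3, windowSize=(win_w, win_h)):
    # Windows that run off the right or bottom edge come back smaller; skip them.
    if window.shape[0] != win_h or window.shape[1] != win_w:
        continue
    full_positions.append((x, y))

print(full_positions)  # [(0, 0), (3, 0)]
```

    Note that window.shape is ordered (rows, columns), while windowSize is given as (width, height).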

    Cheers
