Split a generator/iterable every n items in Python (splitEvery)

臣服心动 2020-11-27 16:44

I'm trying to write the Haskell function 'splitEvery' in Python. Here is its definition:

splitEvery :: Int -> [e] -> [[e]]
    @'splitEvery' n@ s


        
13 Answers
  • 2020-11-27 17:22

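    A one-liner that can be inlined (works on Python 2 and 3, accepts any iterator, and uses only the standard library in a single generator expression):

    import itertools
    def split_groups(iter_in, group_size):
        # Items share a group while index // group_size stays constant.
        # The groups are lazy: consume each one before asking for the next.
        return ((x for _, x in item) for _, item in itertools.groupby(enumerate(iter_in), key=lambda x: x[0] // group_size))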
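
A usage sketch for the groupby approach above, restated so the block runs on its own. The variable names `keyed` and `in_order` are illustrative. Because every group reads from the same underlying iterator, the itertools documentation notes that a group is no longer visible once the groupby object advances, so this sketch turns each group into a list as soon as it is produced:

```python
import itertools

def split_groups(iter_in, group_size):
    # Items share a key while index // group_size stays constant
    keyed = itertools.groupby(enumerate(iter_in), key=lambda pair: pair[0] // group_size)
    # Strip the index again and hand out each group lazily
    return ((x for _, x in group) for _, group in keyed)

# Materialize each group before the outer generator moves on
in_order = [list(group) for group in split_groups(iter(range(7)), 3)]
print(in_order)  # [[0, 1, 2], [3, 4, 5], [6]]
```

Collecting the outer generator first and reading the groups afterwards is the pattern to avoid here.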
    
  • 2020-11-27 17:25

    I came across this while trying to chop a generator fed by a stream into batches; most of the solutions here either aren't applicable to that case or don't work in Python 3.

    For people still stumbling upon this, here's a general solution using itertools:

    from itertools import islice, chain
    
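    def iter_in_slices(iterator, size=None):
        # Share one iterator across all slices (a list would restart each time)
        iterator = iter(iterator)
        while True:
            slice_iter = islice(iterator, size)
            try:
                # Peek at the first object to detect an exhausted iterator
                peek = next(slice_iter)
            except StopIteration:
                # End cleanly; a leaked StopIteration is a RuntimeError on Python 3.7+
                return
            # Put the first object back and yield the slice;
            # consume each slice before requesting the next one
            yield chain([peek], slice_iter)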
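
The peek step above depends on how `StopIteration` behaves inside a generator. Under PEP 479 (the default since Python 3.7), a `StopIteration` that escapes a generator body surfaces as `RuntimeError` instead of quietly ending the generator. A minimal sketch of that behavior, using hypothetical helpers `peek_unguarded` and `drain` that are not part of the answer:

```python
def peek_unguarded(iterator):
    # Hypothetical generator: calls next() with no StopIteration guard
    while True:
        yield next(iterator)

def drain(gen):
    # Collect items and report the name of any RuntimeError that ended iteration
    items = []
    try:
        for item in gen:
            items.append(item)
    except RuntimeError as exc:
        return items, type(exc).__name__
    return items, None

result = drain(peek_unguarded(iter([1, 2])))
print(result)  # on Python 3.7+: ([1, 2], 'RuntimeError')
```

Catching `StopIteration` around the `next()` call and returning from the generator avoids the error.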
    
  • 2020-11-27 17:25

    This will do the trick for pairs (n = 2) when the input is a sliceable sequence:

    from itertools import zip_longest  # named izip_longest in Python 2
    zip_longest(it[::2], it[1::2])
    

    where *it* is a sequence that supports slicing (a string, list, or tuple); generators and other plain iterators cannot be sliced this way.


    Example:

    zip_longest('abcdef'[::2], 'abcdef'[1::2]) -> ('a', 'b'), ('c', 'd'), ('e', 'f')
    

    Let's break this down:

    'abcdef'[::2] -> 'ace'
    'abcdef'[1::2] -> 'bdf'
    

    As you can see, the last number in the slice is the step, that is, the interval used to pick items. You can read more about extended slices in the Python documentation.

    zip_longest takes the first item from the first sequence and combines it with the first item from the second sequence, then does the same for each following position. Unlike zip, which stops when the shortest input runs out, it continues until the longest input is exhausted and pads the missing values with None (or a given fillvalue).

    The result is an iterator. If you want a list, use the list() function on the result.
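
The slicing idea above extends from pairs to any chunk size n by taking n strided slices instead of two. The sketch below assumes a sliceable sequence and uses a hypothetical helper name, `pad_chunks`; `zip_longest` is the Python 3 name for `izip_longest`. Unlike Haskell's splitEvery, the last chunk is padded rather than shortened:

```python
from itertools import zip_longest

def pad_chunks(seq, n, fillvalue=None):
    # Hypothetical helper: seq must support slicing (str, list, tuple)
    # seq[i::n] holds the items that land in position i of every chunk
    columns = [seq[i::n] for i in range(n)]
    return zip_longest(*columns, fillvalue=fillvalue)

padded = list(pad_chunks('abcdefg', 3))
print(padded)  # [('a', 'b', 'c'), ('d', 'e', 'f'), ('g', None, None)]
```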

  • 2020-11-27 17:31
    def chunks(iterable, n):
        """Yield lists of up to n items; assumes n is an integer > 0."""
        iterable = iter(iterable)
        while True:
            result = []
            for i in range(n):
                try:
                    a = next(iterable)
                except StopIteration:
                    break
                else:
                    result.append(a)
            if result:
                yield result
            else:
                break
    
    g1 = (i * i for i in range(10))
    g2 = chunks(g1, 3)
    print(g2)
    # <generator object chunks at 0x0337B9B8>
    print(list(g2))
    # [[0, 1, 4], [9, 16, 25], [36, 49, 64], [81]]
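
The explicit loop above can also be written more compactly. The following is a sketch of one alternative, not the answer's own code: it combines `itertools.islice` with the two-argument form of `iter`, which keeps calling a function until it returns a sentinel value. The name `chunks_islice` is made up for illustration.

```python
from itertools import islice

def chunks_islice(iterable, n):
    # Hypothetical name; assumes n is an integer > 0
    iterator = iter(iterable)
    # Call the lambda repeatedly until it returns the sentinel [] (an empty chunk)
    return iter(lambda: list(islice(iterator, n)), [])

squares = (i * i for i in range(10))
result = list(chunks_islice(squares, 3))
print(result)  # [[0, 1, 4], [9, 16, 25], [36, 49, 64], [81]]
```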
    
  • 2020-11-27 17:32

    more_itertools (a third-party package) has a chunked function:

    import more_itertools as mit
    
    
    list(mit.chunked(range(9), 5))
    # [[0, 1, 2, 3, 4], [5, 6, 7, 8]]
    
  • 2020-11-27 17:33

    Here's a quick one-liner version. Like Haskell's, it is lazy.

    from itertools import islice, takewhile, repeat
    split_every = (lambda n, it:
        takewhile(bool, (list(islice(it, n)) for _ in repeat(None))))
    

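    This requires that you call iter on the input before passing it to split_every. Given a non-empty list directly, islice would restart from the beginning on every pass and the result would never terminate.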

    Example:

    list(split_every(5, iter(range(9))))  # xrange in Python 2
    # [[0, 1, 2, 3, 4], [5, 6, 7, 8]]
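
A small sketch of why the iter call matters for the one-liner: `islice` applied to a list starts over each time, while `islice` applied to a single shared iterator resumes where the previous call stopped. The variable names are illustrative only.

```python
from itertools import islice

data = [0, 1, 2, 3, 4]

# Slicing the list itself: every islice call starts over at the first item
from_list = [list(islice(data, 2)) for _ in range(3)]

# Slicing one shared iterator: every islice call resumes where the last one stopped
iterator = iter(data)
from_iterator = [list(islice(iterator, 2)) for _ in range(3)]

print(from_list)      # [[0, 1], [0, 1], [0, 1]]
print(from_iterator)  # [[0, 1], [2, 3], [4]]
```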
    

    Although not a one-liner, the version below calls iter itself, which removes a common pitfall.

    from itertools import islice, takewhile, repeat
    
    def split_every(n, iterable):
        """
        Slice an iterable into chunks of n elements
        :type n: int
        :type iterable: Iterable
        :rtype: Iterator
        """
        iterator = iter(iterable)
        return takewhile(bool, (list(islice(iterator, n)) for _ in repeat(None)))
    

    (Thanks to @eli-korvigo for improvements.)
