I'm looking for a way to "page through" a Python iterator. That is, I would like to wrap a given iterator iter and page_size with another iterator that would provide the items from iter as a series of pages, where each page would itself be an iterator with up to page_size items.
Look at grouper(), from the itertools recipes.
from itertools import zip_longest

def grouper(iterable, n, fillvalue=None):
    "Collect data into fixed-length chunks or blocks"
    # grouper('ABCDEFG', 3, 'x') --> ABC DEF Gxx
    args = [iter(iterable)] * n
    return zip_longest(*args, fillvalue=fillvalue)
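For illustration, here is roughly what the recipe produces on a short sequence; note how zip_longest pads the last group out to length n with the fill value:

>>> list(grouper('ABCDEFG', 3, 'x'))
[('A', 'B', 'C'), ('D', 'E', 'F'), ('G', 'x', 'x')]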
Based on the pointer to the itertools recipe for grouper(), I came up with the following adaptation of grouper() to mimic Pager. I wanted to filter out any None results and to return an iterator rather than a tuple (though I suspect there might be little advantage in doing this conversion).
# based on http://docs.python.org/library/itertools.html#recipes
from itertools import zip_longest

def grouper2(n, iterable, fillvalue=None):
    args = [iter(iterable)] * n
    for item in zip_longest(fillvalue=fillvalue, *args):
        # Drop the fill values (and any other falsy items) and return each
        # page as an iterator rather than a tuple.
        yield iter(filter(None, item))
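For reference, assuming the grouper2() above (using Python 3's zip_longest), each page comes back as an iterator with the fill values already removed:

>>> [list(page) for page in grouper2(3, 'ABCDEFG')]
[['A', 'B', 'C'], ['D', 'E', 'F'], ['G']]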
I'd welcome feedback on what I can do to improve this code.
I'd do it like this:
from itertools import zip_longest

def pager(iterable, page_size):
    args = [iter(iterable)] * page_size
    # Use a unique sentinel as the fill value so that any value the iterable
    # might legitimately yield (including None) survives the filter below.
    fillvalue = object()
    for group in zip_longest(fillvalue=fillvalue, *args):
        yield (elem for elem in group if elem is not fillvalue)
That way, None can be a legitimate value that the iterator spits out. Only the single sentinel fillvalue is filtered out, and it cannot possibly be an element of the iterable.
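For example, assuming the pager() above, falsy items such as None and 0 survive, and the last page simply comes up short:

>>> [list(page) for page in pager([1, None, 0, 4, 5], 2)]
[[1, None], [0, 4], [5]]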
def group_by(iterable, size):
    """Group an iterable into lists that don't exceed the size given.

    >>> list(group_by([1, 2, 3, 4, 5], 2))
    [[1, 2], [3, 4], [5]]
    """
    sublist = []
    for index, item in enumerate(iterable):
        if index > 0 and index % size == 0:
            yield sublist
            sublist = []
        sublist.append(item)
    if sublist:
        yield sublist
Why aren't you using this?
def grouper(page_size, iterable):
    page = []
    for item in iterable:
        page.append(item)
        if len(page) == page_size:
            yield page
            page = []
    # Yield the final partial page, if there is one.
    if page:
        yield page
"Each page would itself be an iterator with up to page_size" items. Each page is a simple list of items, which is iterable. You could use yield iter(page)
to yield the iterator instead of the object, but I don't see how that improves anything.
It throws a standard StopIteration
at the end.
What more would you want?
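For illustration, assuming the grouper() above, consuming the pages looks like this:

>>> list(grouper(3, 'ABCDEFG'))
[['A', 'B', 'C'], ['D', 'E', 'F'], ['G']]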
more_itertools.chunked will do exactly what you're looking for:
>>> import more_itertools
>>> list(more_itertools.chunked([1, 2, 3, 4, 5, 6], 3))
[[1, 2, 3], [4, 5, 6]]
If you want the chunking without creating temporary lists, you can use more_itertools.ichunked.
That library also has lots of other nice options for efficiently grouping, windowing, slicing, etc.
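As a rough sketch (assuming a reasonably recent more_itertools where ichunked is available), each chunk is handed back as a lazy sub-iterator rather than a list:

>>> import more_itertools
>>> for chunk in more_itertools.ichunked(range(7), 3):
...     print(list(chunk))
[0, 1, 2]
[3, 4, 5]
[6]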