I have been really fascinated by all the interesting iterators in itertools, but one confusion I have had is the difference between these two functions and why
The * operator unpacks the iterator: it consumes the whole iterator up front so that its values can be passed to the function as separate arguments. chain.from_iterable instead pulls items from the iterator one at a time, lazily.
chain(*foo(5)) unpacks the whole generator, packs it into a tuple, and only then processes it. chain.from_iterable(foo(5)) queries the generator created by foo(5) value by value.
Try foo(1000000) with the unpacking form and watch the memory usage go up and up.
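A minimal sketch of the difference, assuming a hypothetical foo(n) that yields n small tuples (the original foo is not shown, so this definition is only a stand-in). A counter records how many items have been pulled from the generator by the time each chain object has merely been constructed:

```python
from itertools import chain

pulled = 0  # how many items have been taken from the generator so far


def foo(n):
    # Hypothetical stand-in for the foo() mentioned in the text.
    global pulled
    for i in range(n):
        pulled += 1
        yield (i, i)


lazy = chain.from_iterable(foo(5))
pulled_after_lazy = pulled    # construction alone pulls nothing from foo(5)

eager = chain(*foo(5))
pulled_after_eager = pulled   # unpacking pulled all 5 tuples to build the call

print(pulled_after_lazy, pulled_after_eager)
print(list(lazy) == list(eager))  # both flatten to the same sequence
```

Both objects produce the same flattened values; they differ only in when the underlying generator is consumed.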
chain(*iterable) can only handle iterables that can be fully unpacked. chain.from_iterable can also handle iterables that cannot be fully unpacked, such as infinite generators.
Consider:
>>> from itertools import chain
>>> def inf():
...     i = 0
...     while True:
...         i += 1
...         yield (i, i)
...
>>> x = inf()
>>> y = chain.from_iterable(x)
>>> z = chain(*x)
<hangs forever>
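To show that the from_iterable version is actually usable with the infinite generator, here is a short sketch that takes only the first few flattened values with itertools.islice. The variable name first_six is just illustrative:

```python
from itertools import chain, islice


def inf():
    # Same infinite generator as in the session above.
    i = 0
    while True:
        i += 1
        yield (i, i)


y = chain.from_iterable(inf())

# Only three tuples are ever requested from inf(), so this returns promptly.
first_six = list(islice(y, 6))
print(first_six)  # [1, 1, 2, 2, 3, 3]
```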
Furthermore, unpacking is itself eager and pays its whole cost up front, so if your iterable has side effects that you want evaluated lazily, from_iterable is your best option.
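As a rough illustration of the side-effect point, the sketch below uses a hypothetical generator, noisy, that records an event each time it produces an item. With from_iterable the "produce" events interleave with consumption; with unpacking they all happen before the first value is consumed:

```python
from itertools import chain

events = []


def noisy(n):
    # Hypothetical generator whose side effect is logging each production.
    for i in range(n):
        events.append(f"produce {i}")
        yield [i]


# Lazy: each inner list is produced only when the loop needs it.
for value in chain.from_iterable(noisy(2)):
    events.append(f"consume {value}")
lazy_events = list(events)

events.clear()

# Eager: the * unpacking runs noisy(2) to completion before the loop starts.
for value in chain(*noisy(2)):
    events.append(f"consume {value}")
eager_events = list(events)

print(lazy_events)
print(eager_events)
```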