Python 3.x: Test if generator has elements remaining

旧巷少年郎 2021-02-04 00:56

When I use a generator in a for loop, it seems to "know" when there are no more elements to be yielded. Now I have to use a generator WITHOUT a for loop, and use next

4 answers
  • 2021-02-04 01:31

    In the general case it is not possible to know in advance whether an iterator is exhausted, because arbitrary code may have to run to decide that. Buffering elements can reveal whether more remain, at a cost, but this is rarely useful.

    In practice the question arises when one wants to take only one or a few elements from an iterator for now, but does not want to write that ugly exception handling code (as indicated in the question). Indeed, it is un-Pythonic to put the concept "StopIteration" into normal application code. And exception handling at the Python level is rather time-consuming, particularly when it's just about taking one element.

    The Pythonic way to handle those situations is either a for .. break [.. else] construct like:

    for x in iterator:
        do_something(x)
        break
    else:
        it_was_exhausted()
    

    or using the builtin next() function with a default, like

    x = next(iterator, default_value)
    

    or using iterator helpers, e.g. from the itertools module, for rewiring things like:

    import itertools
    max_3_elements = list(itertools.islice(iterator, 3))
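
    One caveat with the next(iterator, default) form: if the default can also be a legitimate element (None often is), the result is ambiguous. A minimal sketch of the usual workaround follows, using a private sentinel object. The names _MISSING and take_next are made up for this example and are not part of any library.

```python
# A unique object that no iterator can yield by accident.
_MISSING = object()  # hypothetical name, used only in this sketch


def take_next(iterator):
    """Return (found, value) for the next element of `iterator`."""
    value = next(iterator, _MISSING)
    if value is _MISSING:
        return False, None   # iterator was exhausted
    return True, value       # a real element, which may itself be None


it = iter([None, 1])
print(take_next(it))  # (True, None): None was a real element
print(take_next(it))  # (True, 1)
print(take_next(it))  # (False, None): nothing left
```

    Comparing with `is` rather than `==` matters here, since only identity guarantees the value really is the sentinel.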
    

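    Some iterators, however, expose a "length hint" (PEP 424). Note that the object below is a range iterator; generator objects do not provide __length_hint__():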

    >>> gen = iter(range(3))
    >>> gen.__length_hint__()
    3
    >>> next(gen)
    0
    >>> gen.__length_hint__()
    2
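
    Rather than calling the dunder method directly, the standard library offers operator.length_hint(), which falls back to a default when no hint is available. The sketch below, with a made-up generator function numbers(), shows that a generator gives no usable hint, so this cannot serve as a general "has more elements" test.

```python
import operator


def numbers():
    # Hypothetical generator used only for this illustration.
    yield 1
    yield 2


list_iter = iter([10, 20, 30])
next(list_iter)                        # consume one element
print(operator.length_hint(list_iter))  # 2: list iterators report what is left

gen = numbers()
print(operator.length_hint(gen))        # 0: the default, although 2 elements remain
print(operator.length_hint(gen, -1))    # -1: a custom default marks "unknown"
```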
    

    Note: iterator.__next__() should not be called directly by normal application code. That's why it was renamed from iterator.next() in Python 2. And using next() without a default is not much better ...

  • 2021-02-04 01:38

    The two statements you wrote deal with finding the end of the generator in exactly the same way. The for loop simply calls next() on the iterator (.__next__() in Python 3, .next() in Python 2) until the StopIteration exception is raised, and then it terminates.

    http://docs.python.org/tutorial/classes.html#iterators

    As such I don't think waiting for the StopIteration exception is a 'heavy' way to deal with the problem, it's the way that generators are designed to be used.
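
    To make that concrete, here is a rough Python-level equivalent of what a for loop does. It is only a sketch: manual_for and its body callback are names invented for this example, and the real loop is implemented inside the interpreter.

```python
def manual_for(iterable, body):
    """Roughly what `for item in iterable: body(item)` does."""
    iterator = iter(iterable)
    while True:
        try:
            item = next(iterator)
        except StopIteration:
            break          # the iterator signalled that it is exhausted
        body(item)


seen = []
manual_for((n * n for n in range(4)), seen.append)
print(seen)  # [0, 1, 4, 9]
```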

  • 2021-02-04 01:41

    This may not precisely answer your question, but I found my way here looking to elegantly grab a result from a generator without having to write a try: block. A little googling later I figured this out:

    def g():
        yield 5
    
    result = next(g(), None)
    

    Here result is 5. It would be None if the generator were already exhausted, whether because earlier next calls had consumed its values or because the generator function returned early instead of yielding.

    I strongly prefer handling None as an output over raising for "normal" conditions, so dodging the try/except here is a big win. If the situation calls for it, there's also an easy place to add a default other than None.

  • 2021-02-04 01:43

    This is a great question. I'll try to show you how we can use Python's introspective abilities and open source to get an answer. We can use the dis module to peek behind the curtain and see how the CPython interpreter implements a for loop over an iterator.

    >>> def for_loop(iterable):
    ...     for item in iterable:
    ...         pass  # do nothing
    ...     
    >>> import dis
    >>> dis.dis(for_loop)
      2           0 SETUP_LOOP              14 (to 17) 
                  3 LOAD_FAST                0 (iterable) 
                  6 GET_ITER             
            >>    7 FOR_ITER                 6 (to 16) 
                 10 STORE_FAST               1 (item) 
    
      3          13 JUMP_ABSOLUTE            7 
            >>   16 POP_BLOCK            
            >>   17 LOAD_CONST               0 (None) 
                 20 RETURN_VALUE         
    

    The juicy bit appears to be the FOR_ITER opcode. We can't dive any deeper using dis, so let's look up FOR_ITER in the CPython interpreter's source code. If you poke around, you'll find it in Python/ceval.c. Here's the whole thing:

        TARGET(FOR_ITER)
            /* before: [iter]; after: [iter, iter()] *or* [] */
            v = TOP();
            x = (*v->ob_type->tp_iternext)(v);
            if (x != NULL) {
                PUSH(x);
                PREDICT(STORE_FAST);
                PREDICT(UNPACK_SEQUENCE);
                DISPATCH();
            }
            if (PyErr_Occurred()) {
                if (!PyErr_ExceptionMatches(
                                PyExc_StopIteration))
                    break;
                PyErr_Clear();
            }
            /* iterator ended normally */
            x = v = POP();
            Py_DECREF(v);
            JUMPBY(oparg);
            DISPATCH();
    

    Do you see how this works? We try to grab an item from the iterator; if we fail, we check what exception was raised. If it's StopIteration, we clear it and consider the iterator exhausted.

    So how does a for loop "just know" when an iterator has been exhausted? Answer: it doesn't -- it has to try and grab an element. But why?

    Part of the answer is simplicity. Part of the beauty of implementing iterators is that you only have to define one operation: grab the next element. But more importantly, it makes iterators lazy: they'll only produce the values that they absolutely have to.
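
    The laziness point can be seen directly. In this small sketch (countdown and log are names made up for the illustration), none of the generator's code runs until next() asks for a value, which is also why the only way to learn whether another element exists is to run that code.

```python
log = []


def countdown(n):
    # Hypothetical generator; records when its body actually executes.
    while n > 0:
        log.append(f"computing {n}")
        yield n
        n -= 1


gen = countdown(3)
print(log)        # []: creating the generator runs none of its body
print(next(gen))  # 3
print(log)        # ['computing 3']: only the requested value was computed
```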

    Finally, if you are really missing this feature, it's trivial to implement it yourself. Here's an example:

    class LookaheadIterator:
    
        def __init__(self, iterable):
            self.iterator = iter(iterable)
            self.buffer = []
    
        def __iter__(self):
            return self
    
        def __next__(self):
            if self.buffer:
                return self.buffer.pop()
            else:
                return next(self.iterator)
    
        def has_next(self):
            if self.buffer:
                return True
    
            try:
                self.buffer = [next(self.iterator)]
            except StopIteration:
                return False
            else:
                return True
    
    
    x = LookaheadIterator(range(2))

    print(x.has_next())  # True
    print(next(x))       # 0
    print(x.has_next())  # True
    print(next(x))       # 1
    print(x.has_next())  # False
    print(next(x))       # raises StopIteration: the iterator is exhausted
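
    If a whole class feels like too much, a similar one-element lookahead can be sketched with itertools.chain. The function name peek and the sentinel _EMPTY are assumptions made for this example, not an existing API.

```python
import itertools

_EMPTY = object()  # hypothetical sentinel marking "nothing was yielded"


def peek(iterator):
    """Return (has_item, iterator_to_use_from_now_on)."""
    first = next(iterator, _EMPTY)
    if first is _EMPTY:
        return False, iter(())
    # Put the consumed element back in front of the remaining ones.
    return True, itertools.chain([first], iterator)


has_item, it = peek(iter("ab"))
print(has_item, list(it))  # True ['a', 'b']

has_item, it = peek(iter(""))
print(has_item, list(it))  # False []
```

    The caller must keep using the returned iterator, since the original one has already given up its first element.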
    