Why do Python yield statements form a closure?

情书的邮戳 2021-02-01 17:52

I have two functions that return a list of functions. The functions take in a number x and add i to it. i is an integer increasing from 0-

3 answers
  • 2021-02-01 18:24

    No, yielding has nothing to do with closures.

    Here is how to recognize closures in Python: a closure is

    1. a function

    2. in which an unqualified name lookup is performed

    3. no binding of the name exists in the function itself

    4. but a binding of the name exists in the local scope of a function whose definition surrounds the definition of the function in which the name is looked up.
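
As a minimal sketch of that checklist (the names `outer`, `inner`, `plain`, and `offset` are made up for illustration), CPython exposes the names a function closes over through `__code__.co_freevars` and the captured cells through `__closure__`:

```python
def outer():
    offset = 1  # bound in the enclosing function's local scope

    def inner(x):
        # `offset` is looked up here but never bound inside `inner`,
        # so `inner` is a closure over `offset`.
        return x + offset

    return inner


def plain(x):
    # Every name used here is bound locally, so this is not a closure.
    return x + 1


f = outer()
free_names = f.__code__.co_freevars          # names closed over
captured = f.__closure__[0].cell_contents    # current value held in the cell
print(free_names, captured, plain.__closure__)  # ('offset',) 1 None
```

A function that closes over nothing has `__closure__` set to `None`, which is a quick way to check whether the four conditions apply.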

    The difference in behaviour you observe comes from laziness, not from closures. Compare and contrast the following:

    def lazy():
        return ( lambda x: x+i for i in range(10) )
    
    def immediate():
        return [ lambda x: x+i for i in range(10) ]
    
    def also_lazy():
        for i in range(10):
            yield lambda x:x+i
    
    not_lazy_any_more = list(also_lazy())
    
    print( [ f(10) for f in lazy()             ] ) # 10, 11, ..., 19
    print( [ f(10) for f in immediate()        ] ) # all 19
    print( [ f(10) for f in also_lazy()        ] ) # 10, 11, ..., 19
    print( [ f(10) for f in not_lazy_any_more  ] ) # all 19
    

    Notice that the first and third examples give identical results, as do the second and the fourth. The first and third are lazy, the second and fourth are not.

    Note that all four examples produce a bunch of closures over the most recent binding of i. The difference is that in the first and third cases you evaluate each closure before i is rebound (even before the next closure in the sequence has been created), while in the second and fourth cases you first wait until i has been rebound to 9 (after all the closures have been created and collected), and only then evaluate the closures.

  • 2021-02-01 18:30

    Yielding does not create a closure in Python; lambdas do. The reason you get all 9s in "test_without_closure" isn't that there's no closure. If there weren't one, you wouldn't be able to access i at all. The problem is that all the closures contain a reference¹ to the same variable i, which is 9 by the end of the function.

    This situation isn't much different in test_with_yield. Why, then, do you get different results? Because yield suspends the run of the function, so it's possible to use the yielded lambdas before the end of the function is reached, i.e. before i is 9. To see what this means, consider the following two examples of using test_with_yield:

    [f(0) for f in test_with_yield()]
    # Result: [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
    
    [f(0) for f in list(test_with_yield())]
    # Result: [9, 9, 9, 9, 9, 9, 9, 9, 9, 9]
    

    What's happening here is that the first example yields a lambda (while i is 0), calls it (i is still 0), then advances the function until another lambda is yielded (i is now 1), calls the lambda, and so on. The important thing is that each lambda is called before the control flow returns to test_with_yield (i.e. before the value of i changes).

    In the second example, we first create a list. So the first lambda is yielded (i is 0) and put into the list, the second lambda is created (i is now 1) and put into the list ... until the last lambda is yielded (i is now 9) and put into the list. And then we start calling the lambdas. So since i is now 9, all lambdas return 9.
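
The same sequence can be stepped through by hand with `next()`. This is only a sketch: `gen_adders` is a hypothetical stand-in for `test_with_yield`, assumed to yield `lambda x: x + i` for each `i`, and it uses `range(3)` to keep the trace short.

```python
def gen_adders():
    # Hypothetical stand-in for test_with_yield (assumed shape).
    for i in range(3):
        yield lambda x: x + i


gen = gen_adders()

first = next(gen)           # generator is suspended with i == 0
early = first(0)            # 0: called before i is rebound

second = next(gen)          # generator advanced, i is now 1
after_advance = first(0)    # 1: the *first* lambda sees the new i

rest = list(gen)            # run the generator to the end; i stays at 2
at_end = [f(0) for f in (first, second, *rest)]

print(early, after_advance, at_end)  # 0 1 [2, 2, 2]
```

Calling `first` again after the generator has advanced returns a different value, which shows that the lambda never stored its own copy of `i`.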


    ¹ The important bit here is that closures hold references to variables, not copies of the values those variables held when the closure was created. So if you change the variable outside the lambda, that change is visible inside it. Likewise, if an inner function (which creates a closure the same way a lambda does) rebinds the variable, the change is visible outside as well; in Python 3 the inner function has to declare the name nonlocal for this, because otherwise the assignment creates a new local variable.
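
A minimal sketch of the footnote, assuming Python 3 (the names `make_counter`, `increment`, and `peek` are invented for this example): two inner functions share one variable, and a rebinding made through one is seen by the other.

```python
def make_counter():
    count = 0

    def increment():
        # Without `nonlocal`, `count += 1` would treat count as a new
        # local variable and raise UnboundLocalError.
        nonlocal count
        count += 1
        return count

    def peek():
        return count  # reads the shared variable, not a snapshot

    return increment, peek


increment, peek = make_counter()
increment()
increment()
print(peek())  # 2
```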

  • 2021-02-01 18:46

    Adding to @sepp2k's answer: you're seeing these two different behaviours because the lambda functions don't store i's value when they are created. At creation time, all a lambda knows is where the name i has to be looked up when it is called: the local scope, an enclosing scope, the global scope, or builtins.

    In this particular case it is a closure variable (enclosing scope), and its value changes with each iteration.


    Check out LEGB in Python.
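
A rough sketch of LEGB lookup (all function names here are hypothetical, chosen only for illustration). Each function reads a name from a different scope, and the enclosing-scope case shows that the lookup happens at call time:

```python
name = "global"


def read_local():
    name = "local"
    return name             # L: bound in the function itself


def make_reader():
    name = "enclosing"

    def read_enclosing():
        return name         # E: found in the surrounding function

    name = "rebound"        # rebinding after the definition is still seen
    return read_enclosing


def read_global():
    return name             # G: no local or enclosing binding


def read_builtin():
    return len              # B: falls through to the builtins module


print(read_local(), make_reader()(), read_global(), read_builtin() is len)
# local rebound global True
```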


    Now, why does the second one work as expected but not the first?

    Each time you yield a lambda function, execution of the generator function stops at that point, so when you invoke the lambda it uses the value of i at that moment. In the first case, i has already advanced to 9 before any of the functions is invoked.

    To prove it you can fetch current value of i from the __closure__'s cell contents:

    >>> for func in test_with_yield():
    ...     print("Current value of i is {}".format(func.__closure__[0].cell_contents))
    ...     print(func(9))
    ...
    Current value of i is 0
    9
    Current value of i is 1
    10
    Current value of i is 2
    11
    Current value of i is 3
    12
    Current value of i is 4
    13
    Current value of i is 5
    14
    Current value of i is 6
    15
    ...
    

    If you instead store the functions somewhere and call them later, you will see the same behaviour as in the first case:

    from itertools import islice
    
    funcs = []
    for func in islice(test_with_yield(), 4):
        print("Current value of i is {}".format(func.__closure__[0].cell_contents))
        funcs.append(func)
    
    print('-' * 20)
    
    for func in funcs:
        print("Now value of i is {}".format(func.__closure__[0].cell_contents))
    

    Output:

    Current value of i is 0
    Current value of i is 1
    Current value of i is 2
    Current value of i is 3
    --------------------
    Now value of i is 3
    Now value of i is 3
    Now value of i is 3
    Now value of i is 3
    

    The example Patrick Haugh gave in the comments shows the same thing: sum(t(1) for t in list(test_with_yield()))


    Correct way:

    Pass i as the default value of a lambda parameter. Default values are evaluated once, when the function is created, and are not re-evaluated afterwards (although a mutable default object can still be modified in place). i is now a local variable of each lambda function.

    >>> def test_without_closure():
    ...     return [lambda x, i=i: x+i for i in range(10)]
    ...
    >>> sum(t(1) for t in test_without_closure())
    55
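
As a sketch of the same fix applied to a generator (assuming `test_with_yield` has the shape the other answers imply; `adders_with_default` is a hypothetical name), binding i as a default makes the result independent of when the lambdas are called:

```python
def adders_with_default():
    for i in range(10):
        # `i=i` evaluates i now and stores it as the lambda's own default,
        # so the lambda no longer closes over the loop variable.
        yield lambda x, i=i: x + i


lazy_results = [f(0) for f in adders_with_default()]
eager_results = [f(0) for f in list(adders_with_default())]

print(lazy_results == eager_results == list(range(10)))  # True
```

Because each lambda carries its own default, consuming the generator lazily or collecting it into a list first gives the same values.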
    