Possible bug in pdb module in Python 3 when using list generators

Posted by 十年热恋 on 2019-11-28 21:20:32

If you type interact in your [i]pdb session, you get an interactive session, and list comprehensions work as expected in that mode.

source: http://bugs.python.org/msg215963

It works perfectly fine:

>>> import pdb
>>> def f(seq):
...     pdb.set_trace()
... 
>>> f([1,2,3])
--Return--
> <stdin>(2)f()->None
(Pdb) [x for x in seq]
[1, 2, 3]
(Pdb) [x in seq for x in seq]
[True, True, True]

Without seeing what you are actually doing, nobody can tell you why you got a NameError in your specific case.


TL;DR: In Python 3, list comprehensions are actually functions with their own stack frame, and you cannot access the seq variable, which is an argument of test, from inner stack frames. Inside the comprehension it is instead treated as a global (and hence not found).


What you see is the difference between how Python 2 and Python 3 implement list comprehensions. In Python 2, a list comprehension is shorthand for a for loop, and you can clearly see this in the bytecode:

>>> import dis
>>> def test(): [x in seq for x in seq]
... 
>>> dis.dis(test)
  1           0 BUILD_LIST               0
              3 LOAD_GLOBAL              0 (seq)
              6 GET_ITER            
        >>    7 FOR_ITER                18 (to 28)
             10 STORE_FAST               0 (x)
             13 LOAD_FAST                0 (x)
             16 LOAD_GLOBAL              0 (seq)
             19 COMPARE_OP               6 (in)
             22 LIST_APPEND              2
             25 JUMP_ABSOLUTE            7
        >>   28 POP_TOP             
             29 LOAD_CONST               0 (None)
             32 RETURN_VALUE        

Note how the bytecode contains a FOR_ITER loop. In Python 3, on the other hand, list comprehensions are actually functions with their own stack frame:

>>> def test(): [x in seq2 for x in seq]
... 
>>> dis.dis(test)
  1           0 LOAD_CONST               1 (<code object <listcomp> at 0xb6fef160, file "<stdin>", line 1>) 
              3 MAKE_FUNCTION            0 
              6 LOAD_GLOBAL              0 (seq) 
              9 GET_ITER             
             10 CALL_FUNCTION            1 
             13 POP_TOP              
             14 LOAD_CONST               0 (None) 
             17 RETURN_VALUE      

As you can see, there is no FOR_ITER here; instead there are MAKE_FUNCTION and CALL_FUNCTION opcodes. If we examine the code object of the list comprehension, we can see how the bindings are set up:

>>> test.__code__.co_consts[1]
<code object <listcomp> at 0xb6fef160, file "<stdin>", line 1>
>>> test.__code__.co_consts[1].co_argcount   # it has one argument
1
>>> test.__code__.co_consts[1].co_names      # global variables
('seq2',)
>>> test.__code__.co_consts[1].co_varnames   # local variables
('.0', 'x')

Here .0 is the only argument of the function, x is the local variable of the loop, and seq2 is a global variable. Note that .0, the list comprehension's argument, is the iterator obtained from seq, not seq itself (see the GET_ITER opcode in the output of dis above). This is clearer with a more complex example:

>>> def test():
...     [x in seq for x in zip(seq, a)]
... 
>>> dis.dis(test)
  2           0 LOAD_CONST               1 (<code object <listcomp> at 0xb7196f70, file "<stdin>", line 2>) 
              3 MAKE_FUNCTION            0 
              6 LOAD_GLOBAL              0 (zip) 
              9 LOAD_GLOBAL              1 (seq) 
             12 LOAD_GLOBAL              2 (a) 
             15 CALL_FUNCTION            2 
             18 GET_ITER             
             19 CALL_FUNCTION            1 
             22 POP_TOP              
             23 LOAD_CONST               0 (None) 
             26 RETURN_VALUE 
>>> test.__code__.co_consts[1].co_varnames
('.0', 'x')

Here you can see that the only argument to the list-comprehension, always denoted by .0, is the iterable obtained from zip(seq, a). seq and a themselves are not passed to the list-comprehension. Only iter(zip(seq, a)) is passed inside the list-comprehension.

Another observation we must make is that, when you run pdb, functions you define at the prompt cannot access the local variables of the function being debugged. For example, the following code fails on both Python 2 and Python 3:

>>> import pdb
>>> def test(seq): pdb.set_trace()
... 
>>> test([1,2,3])
--Return--
> <stdin>(1)test()->None
(Pdb) def test2(): print(seq)
(Pdb) test2()
*** NameError: global name 'seq' is not defined

It fails because when defining test2 the seq variable is treated as a global variable, but it's actually a local variable inside the test function, hence it isn't accessible.
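The same failure can be reproduced without pdb. This sketch assumes that the debugger runs what you type roughly like exec(code, globals, locals) with two separate dictionaries; frame_globals and frame_locals are hypothetical stand-ins for the namespaces of the frame being debugged.

```python
# Hypothetical stand-ins for the debugged frame's namespaces.
frame_globals = {}
frame_locals = {"seq": [1, 2, 3]}

# The new function is stored in frame_locals, but its global namespace
# is frame_globals, which has no 'seq'.
exec("def test2(): return seq", frame_globals, frame_locals)

try:
    frame_locals["test2"]()
    outcome = "no error"
except NameError as exc:
    outcome = "NameError: %s" % exc

print(outcome)
```

The function body never consults frame_locals: a function's free names are resolved in its enclosing function scopes, then its globals, then the builtins, and a dictionary passed to exec as locals is none of those.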

The behaviour you see is similar to the following scenario:

#python 2 no error
>>> class A(object):
...     x = 1
...     L = [x for _ in range(3)]
... 
>>> 

#python3 error!
>>> class A(object):
...     x = 1
...     L = [x for _ in range(3)]
... 
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<stdin>", line 3, in A
  File "<stdin>", line 3, in <listcomp>
NameError: global name 'x' is not defined

The first one doesn't give an error because it is mostly equivalent to:

>>> class A(object):
...     x = 1
...     L = []
...     for _ in range(3): L.append(x)
... 

This is because the list comprehension is "expanded" in the bytecode. In Python 3 it fails because you are actually defining a function, and you cannot access the class scope from a nested function scope:

>>> class A(object):
...     x = 1
...     def test():
...             print(x)
...     test()
... 
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<stdin>", line 5, in A
  File "<stdin>", line 4, in test
NameError: global name 'x' is not defined
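As a rough illustration of where the boundary lies, the sketch below contrasts the two positions a class variable can occupy in a comprehension. The class names and the use of exec to capture the error are choices made for this example only.

```python
class A(object):
    items = [1, 2, 3]
    # The outermost iterable is evaluated in the class body itself,
    # so 'items' is found here and handed to the comprehension.
    doubled = [i * 2 for i in items]


print(A.doubled)

# Using a class variable anywhere else in the comprehension fails.
source = """
class B(object):
    x = 1
    L = [x for _ in range(3)]
"""
try:
    exec(source, {})
    class_result = "no error"
except NameError as exc:
    class_result = "NameError: %s" % exc

print(class_result)
```

This mirrors the pdb case: the first iterable is evaluated outside the comprehension, and everything else is resolved as if inside a nested function.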

Note that generator expressions are implemented as functions in Python 2 as well, and in fact you see similar behaviour with them (on both Python 2 and Python 3):

>>> import pdb
>>> def test(seq): pdb.set_trace()
... 
>>> test([1,2,3])
--Return--
> <stdin>(1)test()->None
(Pdb) list(x in seq for x in seq)
*** Error in argument: '(x in seq for x in seq)'

Here pdb doesn't give you more details, but the failure happens for exactly the same reason.
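To see the error that pdb hides, the expression can be evaluated by hand. This is a sketch that assumes separate globals and locals dictionaries, as at the debugger prompt; the dictionary names are hypothetical.

```python
# Hypothetical stand-ins for the debugged frame's namespaces.
frame_globals = {}
frame_locals = {"seq": [1, 2, 3]}

# Only the outermost iterable is evaluated in the calling scope, so this works.
print(eval("list(x for x in seq)", frame_globals, frame_locals))

try:
    eval("list(x in seq for x in seq)", frame_globals, frame_locals)
    hidden_error = None
except NameError as exc:
    hidden_error = exc

# The second 'seq' is looked up inside the generator's own function.
print(repr(hidden_error))
```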


In conclusion: it's not a bug in pdb but a consequence of the way Python implements scopes. AFAIK changing this to allow what you are trying to do in pdb would require some big changes in how functions are treated, and I don't know whether this can be done without modifying the interpreter.
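One workaround, which the answer itself does not propose, is to evaluate the expression against a single namespace in which the locals have been copied on top of the globals. The sketch below assumes that this is acceptable for read-only inspection; eval_with_merged_scope is a made-up helper name.

```python
def eval_with_merged_scope(expression, frame_globals, frame_locals):
    """Evaluate expression with the locals layered over the globals.

    Hypothetical helper: names bound by the expression end up in the
    merged copy and are not written back to either original dictionary.
    """
    merged = dict(frame_globals)
    merged.update(frame_locals)
    return eval(expression, merged)


result = eval_with_merged_scope(
    "[x in seq for x in seq]", {}, {"seq": [1, 2, 3]}
)
print(result)
```

Because seq now lives in the dictionary used as globals, the lookup inside the comprehension succeeds. The price is that locals shadow any globals of the same name in the merged copy.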


Note that when a list comprehension has several for clauses, the inner loops are expanded in the bytecode, like list comprehensions in Python 2:

>>> import dis
>>> def test(): [x + y for x in seq1 for y in seq2]
... 
>>> dis.dis(test)
  1           0 LOAD_CONST               1 (<code object <listcomp> at 0xb71bf5c0, file "<stdin>", line 1>) 
              3 MAKE_FUNCTION            0 
              6 LOAD_GLOBAL              0 (seq1) 
              9 GET_ITER             
             10 CALL_FUNCTION            1 
             13 POP_TOP              
             14 LOAD_CONST               0 (None) 
             17 RETURN_VALUE         
>>> # The only argument to the listcomp is iter(seq1)
>>> import types
>>> func = types.FunctionType(test.__code__.co_consts[1], globals())
>>> dis.dis(func)
  1           0 BUILD_LIST               0 
              3 LOAD_FAST                0 (.0) 
        >>    6 FOR_ITER                29 (to 38) 
              9 STORE_FAST               1 (x) 
             12 LOAD_GLOBAL              0 (seq2) 
             15 GET_ITER             
        >>   16 FOR_ITER                16 (to 35) 
             19 STORE_FAST               2 (y) 
             22 LOAD_FAST                1 (x) 
             25 LOAD_FAST                2 (y) 
             28 BINARY_ADD           
             29 LIST_APPEND              3 
             32 JUMP_ABSOLUTE           16 
        >>   35 JUMP_ABSOLUTE            6 
        >>   38 RETURN_VALUE        

As you can see, the bytecode for the listcomp has an explicit FOR_ITER over seq2. This explicit FOR_ITER is inside the listcomp function, and thus the restrictions on scopes still apply (e.g. seq2 is loaded as a global).

And in fact we can confirm this using pdb:

>>> import pdb
>>> def test(seq1, seq2): pdb.set_trace()
... 
>>> test([1,2,3], [4,5,6])
--Return--
> <stdin>(1)test()->None
(Pdb) [x + y for x in seq1 for y in seq2]
*** NameError: global name 'seq2' is not defined
(Pdb) [x + y for x in non_existent for y in seq2]
*** NameError: name 'non_existent' is not defined

Note how the NameError is about seq2 and not seq1 (which is passed as the function argument), and note how changing the first iterable's name to something that doesn't exist changes the NameError (which means that in the first case seq1 was passed successfully).
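The two error messages can also be compared outside the debugger. In this sketch, evaluate_like_debugger is a hypothetical helper that assumes empty globals and the function's arguments as locals, and a generator expression stands in for the list comprehension so that the result does not depend on how a given Python version compiles list comprehensions.

```python
def evaluate_like_debugger(expression, local_vars):
    """Evaluate with empty globals and the given locals (hypothetical helper)."""
    try:
        return eval(expression, {}, dict(local_vars))
    except NameError as exc:
        return "NameError: %s" % exc


args = {"seq1": [1, 2, 3], "seq2": [4, 5, 6]}

# The first iterable is resolved in the calling scope, the second is not.
first = evaluate_like_debugger("list(x + y for x in seq1 for y in seq2)", args)
second = evaluate_like_debugger(
    "list(x + y for x in non_existent for y in seq2)", args
)

print(first)
print(second)
```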

I just can't understand why you would need to do the above. If you are looking to produce a list of Trues, one for each element in seq, then why not [True for x in seq]? I would guess that you need to assign a local copy first before trying this sort of thing.
