Are list comprehensions syntactic sugar for `list(generator expression)` in Python 3?

有刺的猬 2020-12-06 03:54

In Python 3, is a list comprehension simply syntactic sugar for a generator expression fed into the list function?

e.g. is the following code:



        
4 Answers
  • 2020-12-06 04:23

    The two work differently. The list comprehension uses the dedicated LIST_APPEND bytecode, which calls PyList_Append directly. It therefore avoids both the attribute lookup of list.append and a function call at the Python level.

    >>> import dis
    >>> def func_lc():
    ...     [x**2 for x in y]
    ...
    >>> dis.dis(func_lc)
      2           0 LOAD_CONST               1 (<code object <listcomp> at 0x10d3c6780, file "<ipython-input-42-ead395105775>", line 2>)
                  3 LOAD_CONST               2 ('func_lc.<locals>.<listcomp>')
                  6 MAKE_FUNCTION            0
                  9 LOAD_GLOBAL              0 (y)
                 12 GET_ITER
                 13 CALL_FUNCTION            1 (1 positional, 0 keyword pair)
                 16 POP_TOP
                 17 LOAD_CONST               0 (None)
                 20 RETURN_VALUE
    
    >>> lc_object = list(dis.get_instructions(func_lc))[0].argval
    >>> lc_object
    <code object <listcomp> at 0x10d3c6780, file "<ipython-input-42-ead395105775>", line 2>
    >>> dis.dis(lc_object)
      2           0 BUILD_LIST               0
                  3 LOAD_FAST                0 (.0)
            >>    6 FOR_ITER                16 (to 25)
                  9 STORE_FAST               1 (x)
                 12 LOAD_FAST                1 (x)
                 15 LOAD_CONST               0 (2)
                 18 BINARY_POWER
                 19 LIST_APPEND              2
                 22 JUMP_ABSOLUTE            6
            >>   25 RETURN_VALUE
    

    The list() version, on the other hand, simply passes the generator object to list's __init__ method, which then calls its extend method internally. Because the object is neither a list nor a tuple, CPython first gets its iterator and then adds items to the list until the iterator is exhausted:

    >>> def func_ge():
    ...     list(x**2 for x in y)
    ...
    >>> dis.dis(func_ge)
      2           0 LOAD_GLOBAL              0 (list)
                  3 LOAD_CONST               1 (<code object <genexpr> at 0x10cde6ae0, file "<ipython-input-41-f9a53483f10a>", line 2>)
                  6 LOAD_CONST               2 ('func_ge.<locals>.<genexpr>')
                  9 MAKE_FUNCTION            0
                 12 LOAD_GLOBAL              1 (y)
                 15 GET_ITER
                 16 CALL_FUNCTION            1 (1 positional, 0 keyword pair)
                 19 CALL_FUNCTION            1 (1 positional, 0 keyword pair)
                 22 POP_TOP
                 23 LOAD_CONST               0 (None)
                 26 RETURN_VALUE
    >>> ge_object = list(dis.get_instructions(func_ge))[1].argval
    >>> ge_object
    <code object <genexpr> at 0x10cde6ae0, file "<ipython-input-41-f9a53483f10a>", line 2>
    >>> dis.dis(ge_object)
      2           0 LOAD_FAST                0 (.0)
            >>    3 FOR_ITER                15 (to 21)
                  6 STORE_FAST               1 (x)
                  9 LOAD_FAST                1 (x)
                 12 LOAD_CONST               0 (2)
                 15 BINARY_POWER
                 16 YIELD_VALUE
                 17 POP_TOP
                 18 JUMP_ABSOLUTE            3
            >>   21 LOAD_CONST               1 (None)
                 24 RETURN_VALUE
    >>>
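
    The exact disassembly depends on the CPython version, so here is a minimal sketch that only checks which opcodes appear, rather than relying on byte offsets. The helper name opnames is made up for illustration. It compiles each expression and walks any nested code objects, so it should also cope with versions that inline the comprehension body.

```python
import dis
import types

def opnames(source):
    """Return the set of opcode names used by an expression,
    including any nested code objects (hypothetical helper)."""
    names = set()
    pending = [compile(source, "<example>", "eval")]
    while pending:
        code = pending.pop()
        names.update(ins.opname for ins in dis.get_instructions(code))
        # Comprehension and generator bodies may live in nested code objects.
        pending.extend(c for c in code.co_consts if isinstance(c, types.CodeType))
    return names

# The expressions are only compiled, never run, so `y` need not exist.
lc_ops = opnames("[x**2 for x in y]")
ge_ops = opnames("list(x**2 for x in y)")

print("comprehension uses LIST_APPEND:", "LIST_APPEND" in lc_ops)
print("comprehension uses YIELD_VALUE:", "YIELD_VALUE" in lc_ops)
print("list(genexp) uses LIST_APPEND: ", "LIST_APPEND" in ge_ops)
print("list(genexp) uses YIELD_VALUE: ", "YIELD_VALUE" in ge_ops)
```

    If the explanation above holds on your interpreter, LIST_APPEND should show up only for the comprehension and YIELD_VALUE only for the generator expression.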
    

    Timing comparisons:

    >>> %timeit [x**2 for x in range(10**6)]
    1 loops, best of 3: 453 ms per loop
    >>> %timeit list(x**2 for x in range(10**6))
    1 loops, best of 3: 478 ms per loop
    >>> %%timeit
    out = []
    for x in range(10**6):
        out.append(x**2)
    ...
    1 loops, best of 3: 510 ms per loop
    

    The plain loop is slightly slower because out.append is looked up as an attribute on every iteration. Cache the bound method and time it again:

    >>> %%timeit
    out = []; append = out.append
    for x in range(10**6):
        append(x**2)
    ...
    1 loops, best of 3: 467 ms per loop
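
    The timings above come from IPython's %timeit. Outside IPython, a rough equivalent can be sketched with the standard timeit module. The function names and the smaller N below are my own choices for illustration, and the absolute numbers will differ by machine and Python version.

```python
import timeit

N = 10**5  # assumption: smaller than the 10**6 used above so the sketch runs quickly

def list_comp():
    return [x**2 for x in range(N)]

def gen_to_list():
    return list(x**2 for x in range(N))

def loop_append():
    out = []
    for x in range(N):
        out.append(x**2)  # attribute lookup on every iteration
    return out

def loop_cached_append():
    out = []
    append = out.append  # look the bound method up once
    for x in range(N):
        append(x**2)
    return out

variants = [list_comp, gen_to_list, loop_append, loop_cached_append]

# Take the best of 3 repeats, each running the function 5 times.
timings = {
    func.__name__: min(timeit.repeat(func, number=5, repeat=3))
    for func in variants
}

for name, seconds in timings.items():
    print(f"{name:20s} {seconds:.4f} s")
```

    Taking the minimum of several repeats is the usual way to reduce noise from other processes. Differences of a few percent should still be read with caution.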
    

    Apart from the fact that list comprehensions no longer leak their loop variable, one more difference is that something like this is no longer valid:

    >>> [x**2 for x in 1, 2, 3] # Python 2
    [1, 4, 9]
    >>> [x**2 for x in 1, 2, 3] # Python 3
      File "<ipython-input-69-bea9540dd1d6>", line 1
        [x**2 for x in 1, 2, 3]
                        ^
    SyntaxError: invalid syntax
    
    >>> [x**2 for x in (1, 2, 3)] # Add parentheses
    [1, 4, 9]
    >>> for x in 1, 2, 3: # Python 3: For normal loops it still works
        print(x**2)
    ...
    1
    4
    9
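
    Both points can be checked from a script, assuming Python 3. The sketch below first shows that the loop variable of a comprehension does not overwrite an outer name, then uses compile() to test which spellings the parser accepts. The helper name compiles is hypothetical.

```python
# 1. The comprehension's loop variable does not leak in Python 3.
x = "outer"
squares = [x**2 for x in (1, 2, 3)]
print(x)        # the outer x is untouched
print(squares)

# 2. A bare tuple is rejected in a comprehension but accepted in a for statement.
def compiles(source, mode):
    """Return True if `source` parses and compiles (hypothetical helper)."""
    try:
        compile(source, "<example>", mode)
        return True
    except SyntaxError:
        return False

bare_in_comprehension = compiles("[x**2 for x in 1, 2, 3]", "eval")
bare_in_for_loop = compiles("for x in 1, 2, 3:\n    pass\n", "exec")

print("bare tuple in comprehension compiles:", bare_in_comprehension)
print("bare tuple in for loop compiles:     ", bare_in_for_loop)
```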
    
  • 2020-12-06 04:23

    You can actually show that the two can have different outcomes to prove they are inherently different:

    >>> list(next(iter([])) if x > 3 else x for x in range(10))
    [0, 1, 2, 3]
    
    >>> [next(iter([])) if x > 3 else x for x in range(10)]
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
      File "<stdin>", line 1, in <listcomp>
    StopIteration
    

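    The expression inside the comprehension does not run inside a generator, so the StopIteration propagates out of the comprehension, whereas the list constructor treats it as the end of iteration. Note that since Python 3.7, PEP 479 turns a StopIteration raised inside a generator into a RuntimeError, so on current versions the first example raises RuntimeError instead of returning [0, 1, 2, 3]. The two forms still behave differently.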
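
    To see which behaviour your interpreter has, here is a small sketch that catches the exception from each form and records its name. The helper exhausted is made up for illustration. It assumes Python 3.7 or later for the RuntimeError branch; on older versions the first form would simply return a shortened list.

```python
def exhausted():
    # Hypothetical helper: next() on an empty iterator raises StopIteration.
    return next(iter([]))

# Generator expression passed to list(): the StopIteration is raised
# inside the generator frame.
try:
    genexp_outcome = list(exhausted() if x > 3 else x for x in range(10))
except RuntimeError as exc:
    genexp_outcome = type(exc).__name__

# List comprehension: there is no generator frame, so StopIteration
# propagates to the caller unchanged.
try:
    listcomp_outcome = [exhausted() if x > 3 else x for x in range(10)]
except StopIteration as exc:
    listcomp_outcome = type(exc).__name__

print("list(genexp):      ", genexp_outcome)
print("list comprehension:", listcomp_outcome)
```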

  • 2020-12-06 04:39

    They aren't the same. list() evaluates whatever it is given only after the expression inside the parentheses has finished executing, not before.

    The [] in Python is a bit magical: it tells Python to build a list from whatever is inside it, more like a type hint for the language.
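
    One concrete way to see that [] is syntax while list is just a name looked up at run time: shadow the name and compare. This is only a sketch. It runs a snippet through exec() with its own namespace dict so that nothing in the surrounding program is affected, and the variable names are made up.

```python
source = """
from_call = list(x for x in range(3))
from_syntax = [x for x in range(3)]
"""

# Shadow the name `list` for this snippet only; everything else
# (such as range) still resolves to the real builtins.
namespace = {"list": tuple}
exec(source, namespace)

print(type(namespace["from_call"]).__name__, namespace["from_call"])
print(type(namespace["from_syntax"]).__name__, namespace["from_syntax"])
```

    The call form follows whatever list currently refers to, while the bracket form always builds a real list.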

  • 2020-12-06 04:45

    Both forms create and call an anonymous function. However, the list(...) form creates a generator function and passes the returned generator-iterator to list, while with the [...] form, the anonymous function builds the list directly with LIST_APPEND opcodes.

    The following code gets decompilation output of the anonymous functions for an example comprehension and its corresponding genexp-passed-to-list:

    import dis
    
    def f():
        [x for x in []]
    
    def g():
        list(x for x in [])
    
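    # co_consts[1] is the nested <listcomp> / <genexpr> code object on the
    # CPython version used for the output below (index 0 holds None).
    dis.dis(f.__code__.co_consts[1])
    dis.dis(g.__code__.co_consts[1])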
    

    The output for the comprehension is

      4           0 BUILD_LIST               0
                  3 LOAD_FAST                0 (.0)
            >>    6 FOR_ITER                12 (to 21)
                  9 STORE_FAST               1 (x)
                 12 LOAD_FAST                1 (x)
                 15 LIST_APPEND              2
                 18 JUMP_ABSOLUTE            6
            >>   21 RETURN_VALUE
    

    The output for the genexp is

      7           0 LOAD_FAST                0 (.0)
            >>    3 FOR_ITER                11 (to 17)
                  6 STORE_FAST               1 (x)
                  9 LOAD_FAST                1 (x)
                 12 YIELD_VALUE
                 13 POP_TOP
                 14 JUMP_ABSOLUTE            3
            >>   17 LOAD_CONST               0 (None)
                 20 RETURN_VALUE
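
    The co_consts[1] index is version-specific, so a slightly more defensive sketch is to search co_consts for nested code objects by type. The helper name nested_code_names is hypothetical. One caveat: CPython 3.12 and later inline list comprehensions (PEP 709), so there the comprehension may have no separate anonymous function at all, while the generator expression still does.

```python
import types

def f():
    [x for x in []]

def g():
    list(x for x in [])

def nested_code_names(func):
    """Names of code objects nested directly in func (hypothetical helper)."""
    return [
        const.co_name
        for const in func.__code__.co_consts
        if isinstance(const, types.CodeType)
    ]

# On older CPython 3 versions this should list '<listcomp>'; on versions
# that inline comprehensions it may be empty.
print("f:", nested_code_names(f))
# The generator expression is expected to keep its own '<genexpr>' code object.
print("g:", nested_code_names(g))
```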
    