sorted() using generator expressions rather than lists

梦如初夏 2020-11-30 05:31

After seeing the discussion here: Python - generate the time difference I got curious. I also initially thought that a generator is faster than a list, but when it comes to

8 Answers
  • 2020-11-30 06:13

    The first thing sorted() does is to convert the data to a list. Basically the first line (after argument validation) of the implementation is

    newlist = PySequence_List(seq);
    

    See also the full source code for version 2.7 and version 3.1.2.

    Edit: As pointed out in the answer by aaronasterling, the variable newlist is, well, a new list. If the parameter is already a list, it is copied. So a generator expression really has the advantage of using less memory.
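
The copying behaviour described above can be checked from Python itself. This is only a small sketch with made-up sample data (the names data, from_list and from_gen are illustrative, not from the original answer); it shows that sorted() accepts either a list or a generator expression and hands back a new list in both cases.

```python
data = [3, 1, 2]  # illustrative sample data, not from the original answer

# sorted() accepts any iterable and always returns a new list.
from_list = sorted(data)
from_gen = sorted(x for x in data)

print(from_list)          # [1, 2, 3]
print(from_gen)           # [1, 2, 3]
print(from_list is data)  # False: the input list was copied, not sorted in place
print(data)               # [3, 1, 2]: the original is left untouched
```

Since the input is copied anyway, wrapping it in a list comprehension first means a second full list exists for a moment, which is the memory cost the generator expression avoids.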

  • 2020-11-30 06:13

    The easiest way to see which is faster is to use timeit, which tells me that passing a list is faster than passing a generator (a Python 2 session):

    >>> import random
    >>> randomlist = range(1000)
    >>> random.shuffle(randomlist)
    >>> import timeit
    >>> timeit.timeit("sorted(x for x in randomlist)",setup = "from __main__ import randomlist",number = 10000)
    4.944492386602178
    >>> timeit.timeit("sorted([x for x in randomlist])",setup = "from __main__ import randomlist",number = 10000)
    4.635165083830486
    

    And:

    >>> timeit.timeit("sorted(x for x in xrange(1000,1,-1))",number = 10000)
    1.411807087213674
    >>> timeit.timeit("sorted([x for x in xrange(1000,1,-1)])",number = 10000)
    1.0734657617099401
    

    I think this is because when sorted() converts the incoming value to a list, it can do so more quickly for something that is already a list than for a generator. The source code seems to confirm this (though I am going by the comments rather than a full understanding of everything that is going on).
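
The session above depends on Python 2 behaviour (range() returning a list, and xrange). For anyone repeating the measurement on Python 3, a rough equivalent might look like the sketch below. It assumes Python 3.5 or later for the globals argument of timeit.timeit, and time_sorted is a hypothetical helper introduced here only for illustration. Timings depend on the machine and interpreter version, so no figures are shown.

```python
import random
import timeit

# Python 3: range() is lazy, so build a real list before shuffling it.
randomlist = list(range(1000))
random.shuffle(randomlist)


def time_sorted(stmt, number=1000):
    """Return total seconds for `number` runs of stmt (hypothetical helper)."""
    return timeit.timeit(stmt, globals={"randomlist": randomlist}, number=number)


# Same two statements as in the answer: generator expression vs. list comprehension.
gen_time = time_sorted("sorted(x for x in randomlist)")
list_time = time_sorted("sorted([x for x in randomlist])")

print("generator expression: %.3f s" % gen_time)
print("list comprehension:   %.3f s" % list_time)
```

Passing the list through globals avoids the "from __main__ import ..." setup string, which only works when the code runs as the main script.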
