Efficient ways to duplicate array/list in Python

Asked by 有刺的猬, 2021-01-12 05:40

Note: I'm a Ruby developer trying to find my way in Python.

I wanted to figure out why some scripts use mylist[:] instead of list(mylist) to copy a list, and whether one is more efficient than the other.

2 Answers
  • 2021-01-12 06:00

    I can't comment on the Ruby timing vs. the Python timing. But I can comment on list vs. slice. Here's a quick inspection of the bytecode:

    >>> import dis
    >>> a = range(10)
    >>> def func(a):
    ...     return a[:]
    ... 
    >>> def func2(a):
    ...     return list(a)
    ... 
    >>> dis.dis(func)
      2           0 LOAD_FAST                0 (a)
                  3 SLICE+0             
                  4 RETURN_VALUE        
    >>> dis.dis(func2)
      2           0 LOAD_GLOBAL              0 (list)
                  3 LOAD_FAST                0 (a)
                  6 CALL_FUNCTION            1
                  9 RETURN_VALUE 
    

    Notice that list requires a LOAD_GLOBAL to find the function list. Looking up globals (and calling functions) in Python is relatively slow, which would also explain why a[0:len(a)] is slower. Also remember that list needs to be able to handle arbitrary iterators, whereas slicing doesn't. This means that list needs to allocate a new list, pack elements into it as it iterates, and resize when necessary. There are a few expensive things here: resizing if necessary, and iterating (effectively in Python, not C). With the slicing method, you can calculate the size of the memory you'll need, so you can probably avoid resizing, and the iteration can be done completely in C (probably with a memcpy or something).
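    As a quick illustration of that difference (my own sketch, not from the original answer): list() happily consumes any iterable, while slicing only works on sequences:

    ```python
    # list() accepts any iterable, including generators...
    squares = (x * x for x in range(5))
    print(list(squares))  # [0, 1, 4, 9, 16]

    # ...but slicing requires a sequence that supports slice subscripts,
    # which generators don't.
    gen = (x * x for x in range(5))
    try:
        gen[:]
    except TypeError:
        print("generators don't support slicing")
    ```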

    Disclaimer: I'm not a Python dev, so I don't know for sure how the internals of list() are implemented. I'm just speculating based on what I know of the specification.

    EDIT -- So I've looked at the source (with a little guidance from Martijn). The relevant code is in listobject.c. list calls list_init, which then calls listextend at line 799. That function has some checks to see if it can use a fast branch when the object is a list or a tuple (line 812). Finally, the heavy lifting is done starting at line 834:

     src = PySequence_Fast_ITEMS(b);
     dest = self->ob_item + m;
     for (i = 0; i < n; i++) {
         PyObject *o = src[i];
         Py_INCREF(o);
         dest[i] = o;
     }
    

    Compare that to the slice version which I think is defined in list_subscript (line 2544). That calls list_slice (line 2570) where the heavy lifting is done by the following loop (line 486):

     src = a->ob_item + ilow;
     dest = np->ob_item;
     for (i = 0; i < len; i++) {
         PyObject *v = src[i];
         Py_INCREF(v);
         dest[i] = v;
     }
    

    They're pretty much the same code, so it's not surprising that the performance is almost the same for large lists (where the overhead of the small stuff, like unpacking slices and looking up global variables, becomes less important).
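    One consequence visible in both C loops (Py_INCREF on each item, with no copying of the pointed-to objects) is that both spellings produce a shallow copy. A small sketch of my own to illustrate:

    ```python
    a = [[1, 2], [3, 4]]

    b = a[:]      # slice copy: new outer list
    c = list(a)   # list() copy: also a new outer list

    # Both are distinct list objects...
    assert b is not a and c is not a

    # ...but they share the same element objects (shallow copy),
    # matching the Py_INCREF of each item in the C loops above.
    assert b[0] is a[0] and c[0] is a[0]

    a[0].append(99)
    print(b[0])  # [1, 2, 99]: the inner list is shared
    ```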


    Here's how I would run the Python tests (and the results for my Ubuntu system):

    $ python -m timeit -s 'a=range(30)' 'list(a)'
    1000000 loops, best of 3: 0.39 usec per loop
    $ python -m timeit -s 'a=range(30)' 'a[:]'
    10000000 loops, best of 3: 0.183 usec per loop
    $ python -m timeit -s 'a=range(30)' 'a[0:len(a)]'
    1000000 loops, best of 3: 0.254 usec per loop
    
  • 2021-01-12 06:18

    Use the timeit module in Python to test timings.

    from copy import copy  # import explicitly rather than with a star import
    
    a = range(1000)
    
    def cop():
        b = copy(a)
    
    def func1():
        b = list(a)
    
    def slice_copy():  # renamed so we don't shadow the builtin slice()
        b = a[:]
    
    def slice_len():
        b = a[0:len(a)]
    
    
    if __name__ == "__main__":
        import timeit
        print "copy(a)", timeit.timeit("cop()", setup="from __main__ import cop")
        print "list(a)", timeit.timeit("func1()", setup="from __main__ import func1")
        print "a[:]", timeit.timeit("slice_copy()", setup="from __main__ import slice_copy")
        print "a[0:len(a)]", timeit.timeit("slice_len()", setup="from __main__ import slice_len")
    

    Results:

    copy(a) 3.98940896988
    list(a) 2.54542589188
    a[:] 1.96630120277                   #winner
    a[0:len(a)] 10.5431251526
    

    The extra steps involved in a[0:len(a)] (loading 0, then looking up and calling len) are surely the reason for its slowness.

    Here's the byte code comparison of the two:

    In [19]: dis.dis(func1)
      2           0 LOAD_GLOBAL              0 (range)
                  3 LOAD_CONST               1 (100000)
                  6 CALL_FUNCTION            1
                  9 STORE_FAST               0 (a)
    
      3          12 LOAD_FAST                0 (a)
                 15 SLICE+0             
                 16 STORE_FAST               1 (b)
                 19 LOAD_CONST               0 (None)
                 22 RETURN_VALUE        
    
    In [20]: dis.dis(func2)
      2           0 LOAD_GLOBAL              0 (range)
                  3 LOAD_CONST               1 (100000)
                  6 CALL_FUNCTION            1
                  9 STORE_FAST               0 (a)
    
      3          12 LOAD_FAST                0 (a)    #same up to here
                 15 LOAD_CONST               2 (0)    #loads 0
                 18 LOAD_GLOBAL              1 (len) # loads the builtin len(),
                                                     # so it might take some lookup time
                 21 LOAD_FAST                0 (a)
                 24 CALL_FUNCTION            1         
                 27 SLICE+3             
                 28 STORE_FAST               1 (b)
                 31 LOAD_CONST               0 (None)
                 34 RETURN_VALUE        
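    For reference, here is a Python 3 version of the same comparison (my own sketch; the answers above use Python 2, where the SLICE+0 opcode existed). timeit also accepts callables directly:

    ```python
    import timeit

    a = list(range(1000))

    # All three spellings produce equal (shallow) copies...
    assert a[:] == list(a) == a[0:len(a)]

    # ...so only the timings differ; run each statement many times.
    for label, stmt in [("a[:]", lambda: a[:]),
                        ("list(a)", lambda: list(a)),
                        ("a[0:len(a)]", lambda: a[0:len(a)])]:
        t = timeit.timeit(stmt, number=100000)
        print("%-12s %.3f s" % (label, t))
    ```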
    