问题
Inspired by this nice answer,
Here's a benchmark:
import timeit
def test1():
a = [1,2,3]
a.insert(0,1)
def test2():
a = [1,2,3]
a[0:0]=[1]
print (timeit.timeit('test1()','from __main__ import test1'))
print (timeit.timeit('test2()','from __main__ import test2'))
For me, test2
is sligtly faster (~10%). Why is that the case? I would expect it to be slower since:
- slice assignment must be able to accept iterables of any length and therefore must be more general.
- in slice assignment, we need to create a new list on the right hand side just to get it to work.
Can anybody help me understand this?
(using python 2.7 on OS-X 10.5.8)
回答1:
Your first test case has to call the method insert
on the list a
, whereas all the operations in test2
are handled directly in byte code. Note the CALL_FUNCTION
in the disassembly of test1
below. Calling functions is moderately expensive in Python: certainly expensive enough to account for a few percent difference in run time.
>>> import dis
>>> dis.dis(test1)
2 0 LOAD_CONST 1 (1)
3 LOAD_CONST 2 (2)
6 LOAD_CONST 3 (3)
9 BUILD_LIST 3
12 STORE_FAST 0 (a)
3 15 LOAD_FAST 0 (a)
18 LOAD_ATTR 0 (insert)
21 LOAD_CONST 4 (0)
24 LOAD_CONST 1 (1)
27 CALL_FUNCTION 2
30 POP_TOP
31 LOAD_CONST 0 (None)
34 RETURN_VALUE
>>> dis.dis(test2)
2 0 LOAD_CONST 1 (1)
3 LOAD_CONST 2 (2)
6 LOAD_CONST 3 (3)
9 BUILD_LIST 3
12 STORE_FAST 0 (a)
3 15 LOAD_CONST 1 (1)
18 BUILD_LIST 1
21 LOAD_FAST 0 (a)
24 LOAD_CONST 4 (0)
27 LOAD_CONST 4 (0)
30 STORE_SLICE+3
31 LOAD_CONST 0 (None)
34 RETURN_VALUE
Bad explanation
I posted this first, but after consideration I think it is not correct. The difference I describe here should only make a noticeable difference when there is a lot of data to be moved, which isn't the case in the test here. And even with a lot of data the difference is only a couple of percent:
import timeit
def test1():
a = range(10000000)
a.insert(1,1)
def test2():
a = range(10000000)
a[1:1]=[1]
>>> timeit.timeit(test1, number=10)
6.008707046508789
>>> timeit.timeit(test2, number=10)
5.861173868179321
The method list.insert
is implemented by the function ins1 in listobject.c
. You'll see that it copies the item references for the tail of the list one by one:
for (i = n; --i >= where; )
items[i+1] = items[i];
On the other hand slice assignment is implemented by the function list_ass_slice, which calls memmove
:
memmove(&item[ihigh+d], &item[ihigh],
(k - ihigh)*sizeof(PyObject *));
So I think the answer to your question is that the C library function memmove
is better optimized than the simple loop. See here for the glibc implementation of memmove: I believe that when called from list_ass_slice
it eventually ends up calling _wordcopy_bwd_aligned which you can see is heavily hand-optimized.
来源:https://stackoverflow.com/questions/12537716/why-is-slice-assignment-faster-than-list-insert