List indexing efficiency (python 2 vs python 3)

不打扰是莪最后的温柔 提交于 2019-12-03 22:10:32

This appears to be an artifact of some builds of Python 3.2. The best hypothesis at this point is that all 32-bit Intel builds have the slowdown, but no 64-bit ones do. Read on for further details.

You didn't run nearly enough tests to determine anything. Repeating your test a bunch of times, I got values ranging from 0.31 to 0.54 for the same test, which is a huge variation.

So, I ran your test with 10x the number, and repeat=10, using a bunch of different Python2 and Python3 installs. Throwing away the top and bottom results, averaging the other 8, and dividing by 10 (to get a number equivalent to your tests), here's what I saw:

 1. 0.52/0.53 Lion 2.6
 2. 0.49/0.50 Lion 2.7
 3. 0.48/0.48 MacPorts 2.7
 4. 0.39/0.49 MacPorts 3.2
 5. 0.39/0.48 HomeBrew 3.2

So, it looks like 3.2 is actually slightly faster with [99], and about the same speed with [-1].

However, on a 10.5 machine, I got these results:

 1. 0.98/1.02 MacPorts 2.6
 2. 1.47/1.59 MacPorts 3.2

Back on the original (Lion) machine, I ran in 32-bit mode, and got this:

 1. 0.50/0.48 Homebrew 2.7
 2. 0.75/0.82 Homebrew 3.2

So, it seems like 32-bitness is what matters, and not Leopard vs. Lion, gcc 4.0 vs. gcc 4.2 or clang, hardware differences, etc. It would help to test 64-bit builds under Leopard, with different compilers, etc., but unfortunately my Leopard box is a first-gen Intel Mini (with a 32-bit Core Solo CPU), so I can't do that test.

As further circumstantial evidence, I ran a whole slew of other quick tests on the Lion box, and it looks like 32-bit 3.2 is ~50% slower than 2.x, while 64-bit 3.2 is maybe a little faster than 2.x. But if we really want to back that up, someone needs to pick and run a real benchmark suite.

Anyway, my best guess at this point is that when optimizing the 3.x branch, nobody put much effort into 32-bit i386 Mac builds. Which is actually a reasonable choice for them to have made.

Or, alternatively, they didn't even put much effort into 32-bit i386 period. That possibility might explain why the OP saw 2.x and 3.2 giving similar results on a linux box, while Otto Allmendinger saw 3.2 being similarly slower to 2.6 on a linux box. But since neither of them mentioned whether they were running 32-bit or 64-bit linux, it's hard to know whether that's relevant.

There are still lots of other different possibilities that we haven't ruled out, but this seems like the best one.

here is a code that illustrates at least part of the answer:

$ python
Python 2.7.3 (default, Apr 20 2012, 22:44:07) 
[GCC 4.6.3] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import timeit
>>> t=timeit.timeit('mylist[99]',setup='mylist=list(range(100))',number=50000000)
>>> print (t)
2.55517697334
>>> t=timeit.timeit('mylist[99L]',setup='mylist=list(range(100))',number=50000000)
>>> print (t)
3.89904499054

$ python3
Python 3.2.3 (default, May  3 2012, 15:54:42) 
[GCC 4.6.3] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import timeit
>>> t=timeit.timeit('mylist[99]',setup='mylist=list(range(100))',number=50000000)
>>> print (t)
3.9906489849090576

python3 does not have old int type.

Python 3 range() is the Python 2 xrange(). If you want to simulate the Python 2 range() in Python 3 code, you have to use list(range(num). The bigger the num is, the bigger difference will be observed with your original code.

Indexing should be independent on what is stored inside the list as the list stores only references to the target objects. The references are untyped and all of the same kind. The list type is therefore a homogeneous data structure -- technically. Indexing means to turn the index value into the start address + offset. Calculating the offset is very efficient with at most one subtraction. This is very cheap extra operation when compared with the other operations.

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!