Why are __getitem__(key) and get(key) significantly slower than [key]?

蓝咒 提交于 2019-12-09 09:18:44

问题


It was my understanding that brackets were nothing more than a wrapper for __getitem__. Here is how I benchmarked this:

First, I generated a semi-large dictionary.

items = {}
for i in range(1000000):
    items[i] = 1

Then, I used cProfile to test the following three functions:

def get2(items):
    for k in items.iterkeys():
        items.get(k)

def magic3(items):
    for k in items.iterkeys():
        items.__getitem__(k)

def brackets1(items):
    for k in items.iterkeys():
        items[k]

The results looked like so:

         1000004 function calls in 3.779 CPU seconds

   Ordered by: standard name

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1    0.000    0.000    3.779    3.779 <string>:1(<module>)
        1    2.135    2.135    3.778    3.778 dict_get_items.py:15(get2)
        1    0.000    0.000    0.000    0.000 {method 'disable' of '_lsprof.Profiler' objects}
  1000000    1.644    0.000    1.644    0.000 {method 'get' of 'dict' objects}
        1    0.000    0.000    0.000    0.000 {method 'iterkeys' of 'dict' objects}


         1000004 function calls in 3.679 CPU seconds

   Ordered by: standard name

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1    0.000    0.000    3.679    3.679 <string>:1(<module>)
        1    2.083    2.083    3.679    3.679 dict_get_items.py:19(magic3)
  1000000    1.596    0.000    1.596    0.000 {method '__getitem__' of 'dict' objects}
        1    0.000    0.000    0.000    0.000 {method 'disable' of '_lsprof.Profiler' objects}
        1    0.000    0.000    0.000    0.000 {method 'iterkeys' of 'dict' objects}


         4 function calls in 0.136 CPU seconds

   Ordered by: standard name

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1    0.000    0.000    0.136    0.136 <string>:1(<module>)
        1    0.136    0.136    0.136    0.136 dict_get_items.py:11(brackets1)
        1    0.000    0.000    0.000    0.000 {method 'disable' of '_lsprof.Profiler' objects}
        1    0.000    0.000    0.000    0.000 {method 'iterkeys' of 'dict' objects}

Is the issue in the way I am benchmarking? I tried replacing the bracket access with a simple "pass" to ensure that the data was actually being accessed, and found that "pass" was running much quicker. My interpretation of this was that the data was indeed being accessed. I've also tried appending to a new list, which gave similar results.


回答1:


First, the disassembly posted by Not_a_Golfer:

>>> d = {1:2}
>>> dis.dis(lambda: d[1])
  1           0 LOAD_GLOBAL              0 (d)
              3 LOAD_CONST               1 (1)
              6 BINARY_SUBSCR       
              7 RETURN_VALUE   

>>> dis.dis(lambda: d.get(1))
  1           0 LOAD_GLOBAL              0 (d)
              3 LOAD_ATTR                1 (get)
              6 LOAD_CONST               1 (1)
              9 CALL_FUNCTION            1
             12 RETURN_VALUE  

>>> dis.dis(lambda: d.__getitem__(1))
  1           0 LOAD_GLOBAL              0 (d)
              3 LOAD_ATTR                1 (__getitem__)
              6 LOAD_CONST               1 (1)
              9 CALL_FUNCTION            1
             12 RETURN_VALUE

Now, getting the benchmarking right is obviously important to read anything into the results, and I don't know enough to help much there. But assuming there really is a difference (which makes sense to me), here's my guesses about why there is:

  1. dict.get simply "does more"; it has to check if the key is present, and if not return its second argument (which defaults to None). This means there's some form of conditional or exception-catching, so I am completely unsurprised that this would have different timing characteristics to the more basic operation of retrieving the value associated with a key.

  2. Python has a specific bytecode for the "subscription" operation (as demonstrated in the disassembly). The builtin types, including dict, are implemented primarily in C and their implementations do not necessarily play by the normal Python rules (only their interfaces are required to, and there are plenty of corner cases even there). So my guess would be that the implementation of the BINARY_SUBSCR opcode goes more-or-less directly to the underlying C implementations of builtin types that support this operation. For these types, I expect that it is actually __getitem__ that exists as a Python-level method to wrap the C implementation, rather than that the bracket syntax invokes the Python-level method.

It might be interesting to benchmark thing.__getitem__(key) against thing[key] for an instance of a custom class that implements __getitem__; you might actually see the opposite results there as the BINARY_SUBSCR op-code would internally have to fall back to doing equivalent work to looking up the method and invoking it.



来源:https://stackoverflow.com/questions/10823632/why-are-getitem-key-and-getkey-significantly-slower-than-key

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!