Short answer:

- In Python 2 your assumption is correct: `key in dict.keys()` slows down.
- In Python 3 your assumption is not correct: `key in dict.keys()` performs like `key in dict`.

Details for py2 and py3 follow below.
Python 2.7 answer:

Your assumption is correct, for the following reasons:

- `dict.keys()` involves an additional function call (stack overhead).
- `dict.keys()` returns a list containing all keys in memory (as opposed to a lazy view object), so it needs to allocate that memory.
- `key in dict` is a hash-table lookup, roughly O(1) on average, whereas `key in dict.keys()` is a linear search through a list, O(n).
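A quick way to see why the list matters, runnable on any modern Python, is to force the keys into a list (which is effectively what Python 2's `dict.keys()` does) and compare membership tests with `timeit`. The sizes and iteration counts here are arbitrary, chosen just to make the gap visible:

```python
import timeit

d = {i: i for i in range(100000)}
keys_as_list = list(d)  # Python 2's d.keys() materializes a list like this

# Hash-table lookup: O(1) on average
t_hash = timeit.timeit(lambda: 99999 in d, number=200)

# Linear search through a list (worst case: key is last): O(n)
t_list = timeit.timeit(lambda: 99999 in keys_as_list, number=200)

print("hash lookup: %1.6f s" % t_hash)
print("list search: %1.6f s" % t_list)
```

On my runs the list search is several orders of magnitude slower, and the gap widens with the dict size, just as in the benchmark below.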
I created a small benchmark script to show my point:
```python
#! /usr/bin/python2.7

import datetime as dt
import random

dict_size = 1000000
num_iterations = 100

d = {i: i for i in xrange(dict_size)}

def f():
    k = random.randint(0, dict_size - 1)
    return (k in d)

def g():
    k = random.randint(0, dict_size - 1)
    return (k in d.keys())

def test(func):
    t = dt.datetime.utcnow()
    for i in xrange(num_iterations):
        func()
    print "%s --> %1.6f s" % (func, (dt.datetime.utcnow() - t).total_seconds())

test(f)
test(g)
```
Output (Python 2.7.6, Ubuntu 14.04):

```
<function f at 0x7ff2e0126d70> --> 0.000598 s
<function g at 0x7ff2e0126de8> --> 5.191553 s
```
I also tested with the number of iterations and the dict size swapped (a dict of only 100 items, 1M iterations):

```
<function f at 0x7f94cb5e6d70> --> 3.614162 s
<function g at 0x7f94cb5e6de8> --> 7.007922 s
```
Here the results are much closer together.
So the performance difference really grows with the size of the dict.
Python 3 answer:

I adapted the script for Python 3:

- `xrange` is gone; use `range` instead. (It is not in the performance-critical inner loop of the test functions, so the performance impact is limited.)
- Use parentheses with `print`.
- Change the shebang line to `#!/usr/bin/python3`.

And tested with Python 3.4.3 on the same machine.
`dict_size = 1000000; num_iterations = 100`:

```
f --> 0.000590 s
g --> 0.000565 s
```

`dict_size = 100; num_iterations = 1000000`:

```
f --> 4.525487 s
g --> 4.837232 s
```
So in Python 3, the performance difference is gone.
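The reason: in Python 3, `dict.keys()` no longer builds a list. It returns a lightweight, set-like view whose membership test goes straight to the dict's hash table. A minimal demonstration:

```python
from collections.abc import KeysView, Set

d = {"a": 1, "b": 2}
keys = d.keys()

# keys() returns a view object, not a list
print(type(keys))                  # <class 'dict_keys'>
print(isinstance(keys, KeysView))  # True
print(isinstance(keys, Set))       # True: set-like, O(1) membership

# Membership goes straight to the dict's hash table
print("a" in keys)                 # True

# The view even supports set operations without copying the keys
print(keys & {"a", "c"})           # {'a'}
```

This is why `key in d.keys()` and `key in d` end up with essentially the same timings in the benchmark above.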