Is it possible to give a Python dict an initial capacity (and is it useful)?

*爱你&永不变心* submitted on 2019-11-30 17:52:35

First off, I've heard rumors that you can set the size of a dictionary at initialization, but I have never seen any documentation or PEP describing how this would be done.

With this in mind, I ran an analysis on your quantity of items, described below. While each resize of the dictionary may take some time, I would recommend moving ahead without worrying about it, at least until you can test its performance.

The two rules that concern us in determining resizing are the number of elements and the resize factor. A dictionary resizes itself when adding an element makes it 2/3 full. Below 50,000 elements it grows by a factor of 4; above that amount it grows by a factor of 2. Using your estimate of 10,000,000 elements (between 2^23 and 2^24), your dictionary will resize itself 15 times (7 times below 50k, 8 times above). Another resize would occur just past 11,100,000.
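The two rules above can be turned into a small model. This is only a sketch of the rules as this answer states them, not CPython's actual code: the starting table size of 8 slots and the name simulate_resizes are assumptions made for illustration, and newer Python versions use a different growth policy. Its tally need not match the hand count quoted above exactly, since that depends on what is counted as a resize.

```python
def simulate_resizes(n_items, initial_size=8, big_dict=50000):
    """Model the resize rules described above.

    Assumptions (not taken from CPython source): the table starts with
    `initial_size` slots, grows when the item count reaches 2/3 of the
    table, by a factor of 4 up to `big_dict` items and by 2 beyond that.
    Returns a list of (item_count_at_resize, new_table_size) pairs.
    """
    size = initial_size
    events = []
    while True:
        # Smallest item count that makes the table at least 2/3 full,
        # i.e. ceil(2 * size / 3) using integer arithmetic.
        trigger = -(-2 * size // 3)
        if trigger > n_items:
            break
        factor = 2 if trigger > big_dict else 4
        size *= factor
        events.append((trigger, size))
    return events


events = simulate_resizes(10000000)
print("resizes in this model:", len(events))
for item_count, new_size in events:
    print("at %d items -> table of %d slots" % (item_count, new_size))
```

Printing the events shows where the model places each growth step, including the last one before 10,000,000 items and the size of the table that ends up holding them.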

Resizing and replacing the current elements in the hashtable does take some time, but I wonder if you'd notice it with whatever else you have going on in the code nearby. I just put together a timing suite comparing inserts at five places along each boundary from dictionary sizes of 2^3 through 2^24, and the "border" additions average 0.4 nanoseconds longer than the "non-border" additions. This is 0.17% longer... probably acceptable. The minimum for all operations was 0.2085 microseconds, and max was 0.2412 microseconds.
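For anyone who wants to repeat this kind of measurement, here is a minimal sketch. It is not the timing suite described above: the function name time_inserts and the choice of 100,000 integer keys are assumptions for illustration, and the overhead of calling the timer around every insert is large compared with the insert itself, so treat the numbers as rough.

```python
import time


def time_inserts(n_items):
    """Insert n_items integer keys into a fresh dict.

    Returns a list with the wall-clock duration (in seconds) of each
    individual insert, measured with time.perf_counter().
    """
    d = {}
    durations = []
    for i in range(n_items):
        start = time.perf_counter()
        d[i] = None
        durations.append(time.perf_counter() - start)
    return durations


durations = time_inserts(100000)
mean_ns = 1e9 * sum(durations) / len(durations)
# Indices of the five slowest inserts; resizes should tend to show up here,
# although scheduler noise can produce outliers as well.
slowest = sorted(range(len(durations)), key=durations.__getitem__, reverse=True)[:5]
print("mean insert: %.1f ns" % mean_ns)
print("slowest inserts happened at item counts:", sorted(i + 1 for i in slowest))
```

Running it several times and comparing the slowest item counts across runs helps separate genuine resize costs from random noise.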

Hope this is insightful, and if you do check the performance of your code, please follow up with an edit! My primary resource for dictionary internals was the splendid talk given by Brandon Rhodes at PyCon 2010: "The Mighty Dictionary".

Yes, you can, and here is a solution I found in another person's question that is related to yours:

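# Python 2 code (xrange, izip); on Python 3 use range and zip instead
import itertools

# Plain loop, one insert per key
d = {}
for i in xrange(4000000):
    d[i] = None
# 722ms

# Build from (key, None) pairs
d = dict(itertools.izip(xrange(4000000), itertools.repeat(None)))
# 634ms

# Let dict.fromkeys assign None to every key
dict.fromkeys(xrange(4000000))
# 558ms

# fromkeys on an existing set of keys
s = set(xrange(4000000))
dict.fromkeys(s)
# Not including set construction 353ms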

Those are different ways to initialize a dictionary that already holds a certain number of keys.
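The snippets above use Python 2 names (xrange, itertools.izip). A rough Python 3 sketch of the same comparison follows. The function names and the smaller key count N are assumptions chosen for illustration, and the timings it prints will differ from the figures quoted above, which come from the original poster's machine.

```python
import itertools
import timeit

N = 500000  # smaller than the 4,000,000 used above, to keep the run short


def fill_with_loop():
    """Insert the keys one at a time."""
    d = {}
    for i in range(N):
        d[i] = None
    return d


def fill_with_zip():
    """Build the dict from (key, None) pairs."""
    return dict(zip(range(N), itertools.repeat(None)))


def fill_with_fromkeys():
    """Let dict.fromkeys assign None to every key."""
    return dict.fromkeys(range(N))


for func in (fill_with_loop, fill_with_zip, fill_with_fromkeys):
    # Best of three single runs, reported in milliseconds.
    seconds = min(timeit.repeat(func, number=1, repeat=3))
    print("%-20s %.0f ms" % (func.__name__, seconds * 1000))
```

All three functions return the same dictionary, so any difference in the printed times reflects only how the keys were inserted.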
