问题
I have a list of dictionaries that I would like to combine into one dictionary and add the values from each dictionary in the list. For example:
ds = [{1: 1, 2: 0, 3: 0}, {1: 2, 2: 1, 3: 0}, {1: 3, 2: 2, 3: 1, 4: 5}]
The final results should be a single dictionary:
merged = {1: 6, 2: 3, 3: 1, 4: 5}
I'm interested in performance and am looking for the fastest implementation that can merge a list of n-dictionaries into one dictionary and sum the values. An obvious implementation is:
from collections import defaultdict
merged = defaultdict(int)
for d in ds:
for k, v in d.items():
merged[k] += v
Is there a faster way to do this in Python 2.6?
回答1:
defaultdict
is still fastest, I found a few ways to speed it up by caching function names and now just found another way that sped it up significantly, by just iterating for k in d
instead of using d.items()
or d.iteritems()
Some timings so far:
from random import randrange
ds = [dict((randrange(1, 1000), randrange(1, 1000)) for i in xrange(500))
for i in xrange(10000)]
# 10000 dictionaries of approx. length 500
from collections import defaultdict
def merge1(dicts, defaultdict=defaultdict, int=int):
merged = defaultdict(int)
for d in dicts:
for k in d:
merged[k] += d[k]
return merged
def merge2(dicts):
merged = {}
merged_get = merged.get
for d in dicts:
for k in d:
merged[k] = merged_get(k, 0) + d[k]
return merged
def merge3(dicts):
merged = {}
for d in dicts:
for k in d:
merged[k] = merged[k] + d[k] if k in merged else 0
return merged
from timeit import timeit
for func in ('merge1', 'merge2', 'merge3'):
print func, timeit(stmt='{0}(ds)'.format(func),
setup='from __main__ import merge1, merge2, merge3, ds',
number=1)
merge1 0.992541510164
merge2 1.40478747997
merge3 1.23502204889
来源:https://stackoverflow.com/questions/16169723/fastest-way-to-merge-n-dictionaries-and-add-values-on-2-6