It seems like a simple task: I am trying to merge two dictionaries by appending the values rather than overwriting them.
a = {1: [(1,1)],2: [(2,2),(3,3)],3: [(4,4)]}
b = {3: [(5,5)], 4: [(6,6)]}
If you want a third dictionary that holds the combined result, I would use collections.defaultdict:
from collections import defaultdict
from itertools import chain
all = defaultdict(list)
for k, v in chain(a.iteritems(), b.iteritems()):
    all[k].extend(v)
outputs
defaultdict(<type 'list'>, {1: [(1, 1)], 2: [(2, 2), (3, 3)], 3: [(4, 4), (5, 5)], 4: [(6, 6)]})
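On Python 3, dict.iteritems() no longer exists, so the same idea can be sketched with .items(). The name merged is an illustrative choice (it also avoids shadowing the built-in all), and the value of b is inferred from the output shown above.

```python
from collections import defaultdict
from itertools import chain

a = {1: [(1, 1)], 2: [(2, 2), (3, 3)], 3: [(4, 4)]}
b = {3: [(5, 5)], 4: [(6, 6)]}  # assumed input, inferred from the output above

# Missing keys start out as a fresh empty list, so a's lists are never reused.
merged = defaultdict(list)
for k, v in chain(a.items(), b.items()):
    merged[k].extend(v)

# Convert back to a plain dict if the defaultdict behaviour is not wanted.
print(dict(merged))
```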
As an explanation of why your a changes, consider your loop:
for k in a.keys():
    if k in all:
        all[k].append(a[k])
    else:
        all[k] = a[k]
So, if k is not yet in all, you enter the else part, and now all[k] points to the a[k] list. It's not a copy, it's a reference to a[k]: they are the same object. The next time that key comes up, all[k] is defined and you append to it; but because all[k] points to a[k], you end up appending to a[k] as well.
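The effect can be reproduced in a few lines. This is a reconstruction, not the exact code from the question: it assumes the loop above is run once for each dictionary, uses combined in place of all, and takes the value of b from elsewhere in this thread.

```python
a = {1: [(1, 1)], 2: [(2, 2), (3, 3)], 3: [(4, 4)]}
b = {3: [(5, 5)], 4: [(6, 6)]}  # assumed second input

combined = {}
for d in (a, b):
    for k in d.keys():
        if k in combined:
            combined[k].append(d[k])  # mutates whatever list combined[k] refers to
        else:
            combined[k] = d[k]        # stores a reference to d[k], not a copy

# Key 3 first came from a, so combined[3] and a[3] are one and the same list.
print(combined[3] is a[3])
# The append for b's key 3 therefore shows up in a as well.
print(a[3])
```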
You want to avoid all[k] = a[k]. You could try this instead:
for k in a.keys():
    if k not in all:
        all[k] = []
    all[k].extend(a[k])
(Note the extend instead of the append, as pointed out by @Martijn Pieters.) Here, all[k] never points to a[k], so you're safe. @Martijn Pieters' answer is far more concise and elegant, though, so you should go with it.
Use .extend instead of .append for merging lists together.
>>> example = [1, 2, 3]
>>> example.append([4, 5])
>>> example
[1, 2, 3, [4, 5]]
>>> example.extend([6, 7])
>>> example
[1, 2, 3, [4, 5], 6, 7]
Moreover, you can loop over the keys and values of both a and b together using itertools.chain:
from itertools import chain
all = {}
for k, v in chain(a.iteritems(), b.iteritems()):
    all.setdefault(k, []).extend(v)
.setdefault() looks up a key, and sets it to a default if it is not yet there. Alternatively you could use collections.defaultdict to do the same implicitly.
outputs:
>>> a
{1: [(1, 1)], 2: [(2, 2), (3, 3)], 3: [(4, 4)]}
>>> b
{3: [(5,5)], 4: [(6,6)]}
>>> all
{1: [(1, 1)], 2: [(2, 2), (3, 3)], 3: [(4, 4), (5, 5)], 4: [(6, 6)]}
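The same setdefault approach can be wrapped in a small helper. This is only a sketch: merge_lists is a hypothetical name, and it uses .items() on the assumption that the code runs on Python 3.

```python
from itertools import chain

def merge_lists(*dicts):
    """Return a new dict mapping each key to the concatenation of its lists."""
    merged = {}
    for k, v in chain.from_iterable(d.items() for d in dicts):
        # setdefault inserts a fresh empty list the first time a key is seen,
        # so the input lists are only read, never stored or mutated.
        merged.setdefault(k, []).extend(v)
    return merged

a = {1: [(1, 1)], 2: [(2, 2), (3, 3)], 3: [(4, 4)]}
b = {3: [(5, 5)], 4: [(6, 6)]}
result = merge_lists(a, b)
print(result)
```

Taking any number of dictionaries keeps the helper usable when more than two need combining.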
Note that because we now create a clean new list for each key first and then extend it, your original lists in a are unaffected. Your code did not create a copy of the list; it copied the reference to the list. As a result, the values in both the all and the a dicts point to the same lists, and using append on those lists makes the changes visible in both places.
It's easy to demonstrate that with simple variables instead of a dict:
>>> foo = [1, 2, 3]
>>> bar = foo
>>> bar
[1, 2, 3]
>>> bar.append(4)
>>> foo, bar
([1, 2, 3, 4], [1, 2, 3, 4])
>>> id(foo), id(bar)
(4477098392, 4477098392)
Both foo and bar refer to the same list; the list was not copied. To create a copy instead, use the list() constructor or the [:] slice operator:
>>> bar = foo[:]
>>> bar.append(5)
>>> foo, bar
([1, 2, 3, 4], [1, 2, 3, 4, 5])
>>> id(foo), id(bar)
(4477098392, 4477098536)
Now bar is a new copy of the list, and changes are no longer visible in foo. The memory addresses (the result of the id() call) differ for the two lists.
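The list() constructor mentioned above behaves the same way as the slice. A minimal sketch, using the is operator rather than comparing id() values by eye:

```python
foo = [1, 2, 3]
bar = list(foo)   # builds a new list containing the same items
bar.append(4)

print(foo, bar)
# is tests object identity, which is what comparing id() values checks.
print(foo is bar)
```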