The its-late-and-im-probably-stupid department presents:
>>> import multiprocessing
>>> mgr = multiprocessing.Manager()
>>> d = mg
This is pretty interesting behavior. I am not exactly sure how it works, but I'll take a crack at why it behaves the way it does.
First, note that multiprocessing.Manager().dict() is not a dict, it is a DictProxy object:
>>> d = multiprocessing.Manager().dict()
>>> d
<DictProxy object, typeid 'dict' at 0x7fa2bbe8ea50>
The purpose of the DictProxy class is to give you a dict that is safe to share across processes. The real dict lives in the manager's server process, and every method call on the proxy is forwarded there, so the proxy cannot behave exactly like a normal dict.
One consequence is that you cannot directly access mutable objects nested inside a DictProxy: a lookup hands back a pickled copy of the value rather than the object itself. If direct access were allowed, you would be able to modify the shared object in a way that bypasses everything that makes DictProxy safe to use.
Here is some evidence that you can't access mutable objects directly, which is similar to what is going on with setdefault():
>>> d['foo'] = []
>>> foo = d['foo']
>>> id(d['foo'])
140336914055536
>>> id(foo)
140336914056184
With a normal dictionary you would expect d['foo'] and foo to point to the same list object, and modifications to one would modify the other. As you have seen, this is not the case for the DictProxy class because of the additional process safety requirement imposed by the multiprocessing module.
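To make the copy behavior concrete, here is a minimal sketch (Python 3, where a manager can be used as a context manager). The function name show_copy_semantics is just something I made up for illustration; it mutates a value fetched from the proxy and then returns both the local list and a plain-dict snapshot of what the manager still holds.

```python
import multiprocessing


def show_copy_semantics():
    """Mutate a value fetched from a DictProxy and report both sides."""
    with multiprocessing.Manager() as mgr:
        d = mgr.dict()
        d['foo'] = []

        foo = d['foo']               # local copy, unpickled from the manager process
        foo.append({'bar': 'baz'})   # changes only the local copy

        # d.copy() returns a plain dict snapshot of the shared contents,
        # which stays usable after the manager shuts down.
        return foo, d.copy()


if __name__ == '__main__':
    local, shared = show_copy_semantics()
    print(local)    # the local list contains the appended item
    print(shared)   # the shared dict still maps 'foo' to an empty list
```

The append never reaches the manager process, which is the same reason the setdefault() call in the question appears to do nothing.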
edit: The following note from the multiprocessing documentation clarifies what I was trying to say above:
Note: Modifications to mutable values or items in dict and list proxies will not be propagated through the manager, because the proxy has no way of knowing when its values or items are modified. To modify such an item, you can re-assign the modified object to the container proxy:
# create a list proxy and append a mutable object (a dictionary)
lproxy = manager.list()
lproxy.append({})
# now mutate the dictionary
d = lproxy[0]
d['a'] = 1
d['b'] = 2
# at this point, the changes to d are not yet synced, but by
# reassigning the dictionary, the proxy is notified of the change
lproxy[0] = d
Based on the above information, here is how you could rewrite your original code to work with a DictProxy:
# d.setdefault('foo', []).append({'bar': 'baz'})
d['foo'] = d.get('foo', []) + [{'bar': 'baz'}]
As Edward Loper suggested in the comments, I edited the code above to use get() instead of setdefault().
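If you do this in several places, the one-liner above could be wrapped in a small helper. This is only a sketch: append_to_key and demo are hypothetical names, not part of multiprocessing. Also note my assumption about concurrency: the get() and the assignment are two separate round trips to the manager, so if several processes append to the same key you would probably want to hold a shared lock (for example one created with mgr.Lock()) around the call.

```python
import multiprocessing


def append_to_key(proxy, key, item):
    """Append item to the list stored under key, then write the list back.

    Hypothetical helper. Reassigning the key is what tells the proxy
    that the value changed. Not atomic across processes.
    """
    proxy[key] = proxy.get(key, []) + [item]


def demo():
    with multiprocessing.Manager() as mgr:
        d = mgr.dict()
        append_to_key(d, 'foo', {'bar': 'baz'})
        append_to_key(d, 'foo', {'spam': 'eggs'})
        return d.copy()   # plain dict snapshot of the shared contents


if __name__ == '__main__':
    print(demo())
```

The helper only relies on get() and item assignment, so it behaves the same way on a plain dict, which makes it easy to try out without starting a manager.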
The Manager().dict() is a DictProxy object:
>>> mgr.dict()
<DictProxy object, typeid 'dict' at 0x1007bab50>
>>> type(mgr.dict())
<class 'multiprocessing.managers.DictProxy'>
DictProxy is a subclass of the BaseProxy type, which does not behave entirely like a normal dict: http://docs.python.org/library/multiprocessing.html?highlight=multiprocessing#multiprocessing.managers.BaseProxy
So it seems you have to treat the object returned by mgr.dict() differently than you would a plain dict.
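One way to treat it differently, if you are on Python 3.6 or later (where the docs say shared objects can be nested), is to store another proxy as the value instead of a plain list. A rough sketch, with nested_proxy_demo being a name I made up:

```python
import multiprocessing


def nested_proxy_demo():
    """Store a ListProxy inside a DictProxy so in-place appends are shared."""
    with multiprocessing.Manager() as mgr:
        d = mgr.dict()
        d['foo'] = mgr.list()             # a ListProxy, not a plain list
        d['foo'].append({'bar': 'baz'})   # forwarded to the manager process
        return list(d['foo'])             # copy the contents into a plain list


if __name__ == '__main__':
    print(nested_proxy_demo())
```

Here d['foo'] comes back as a proxy rather than a copied list, so the append is applied to the list held by the manager. I have only sketched the single-process case; this does not help on older Python versions.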
items() returns a copy. Appending to a copy does not affect the original. Did you mean this?
>>> d['foo'] = {'bar': 'baz'}
>>> print d.items()
[('foo', {'bar': 'baz'})]