Is multiprocessing.Manager().dict().setdefault() broken?

抹茶落季 2021-01-05 03:00

The its-late-and-im-probably-stupid department presents:

>>> import multiprocessing
>>> mgr = multiprocessing.Manager()
>>> d = mg         


        
3 answers
  • 2021-01-05 03:45

    This is some pretty interesting behavior. I am not exactly sure how it works, but I'll take a crack at why it behaves the way it does.

    First, note that multiprocessing.Manager().dict() is not a dict, it is a DictProxy object:

    >>> d = multiprocessing.Manager().dict()
    >>> d
    <DictProxy object, typeid 'dict' at 0x7fa2bbe8ea50>
    

    The purpose of the DictProxy class is to give you a dict that is safe to share across processes. The real dict lives in the manager's server process; the proxy forwards each method call to that process and hands back a pickled copy of the result.

    A consequence is that you cannot directly reach a mutable object nested inside a DictProxy. What you get back is a copy, so changing it does not touch the object the manager holds, and the proxy has no way of knowing that you changed it.

    Here is some evidence that each access returns a different object, which is the same thing that is going on with setdefault():

    >>> d['foo'] = []
    >>> foo = d['foo']
    >>> id(d['foo'])
    140336914055536
    >>> id(foo)
    140336914056184
    

    With a normal dictionary you would expect d['foo'] and foo to point to the same list object, and modifications to one would modify the other. As you have seen, this is not the case for the DictProxy class, because every lookup goes through the manager process and returns a fresh copy.
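
    To see the copy behavior end to end, here is a minimal sketch. The helper name fetch_and_mutate is made up for illustration, and it assumes Python 3.3 or later so the manager can be used as a context manager:

```python
import multiprocessing


def fetch_and_mutate(shared, key, item):
    """Append `item` to the list fetched from `shared[key]`.

    `shared` is assumed to be a DictProxy whose value under `key` is a
    plain list. Returns the mutated local list and a fresh lookup.
    """
    local_copy = shared[key]   # unpickled copy of the list the manager holds
    local_copy.append(item)    # changes only the copy in this process
    return local_copy, shared[key]


if __name__ == "__main__":
    with multiprocessing.Manager() as mgr:
        d = mgr.dict()
        d["foo"] = []
        local_copy, stored = fetch_and_mutate(d, "foo", {"bar": "baz"})
        print(local_copy)  # [{'bar': 'baz'}]
        print(stored)      # []
```

    The append lands on the local copy only, which is why the chained setdefault(...).append(...) call appears to do nothing.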

    edit: The following note from the multiprocessing documentation clarifies what I was trying to say above:


    Note: Modifications to mutable values or items in dict and list proxies will not be propagated through the manager, because the proxy has no way of knowing when its values or items are modified. To modify such an item, you can re-assign the modified object to the container proxy:

    # create a list proxy and append a mutable object (a dictionary)
    lproxy = manager.list()
    lproxy.append({})
    # now mutate the dictionary
    d = lproxy[0]
    d['a'] = 1
    d['b'] = 2
    # at this point, the changes to d are not yet synced, but by
    # reassigning the dictionary, the proxy is notified of the change
    lproxy[0] = d
    

    Based on the above information, here is how you could rewrite your original code to work with a DictProxy:

    # d.setdefault('foo', []).append({'bar': 'baz'})
    d['foo'] = d.get('foo', []) + [{'bar': 'baz'}]
    

    As Edward Loper suggested in the comments, I edited the code above to use get() instead of setdefault().
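
    One caveat with the get()-based line: the read and the write are two separate calls to the manager, so two processes running it at the same time can overwrite each other's update. Here is a sketch of one way around that, assuming every writer shares a lock created by the same manager (append_item and worker are hypothetical names, not part of multiprocessing):

```python
import multiprocessing


def append_item(shared, lock, key, item):
    """Append `item` to the list stored under `key` in `shared`.

    Holding `lock` keeps the read-modify-write step from interleaving
    with other callers that use the same lock.
    """
    with lock:
        shared[key] = shared.get(key, []) + [item]


def worker(shared, lock, n):
    # Each worker process adds one entry under the same key.
    append_item(shared, lock, "foo", {"bar": n})


if __name__ == "__main__":
    with multiprocessing.Manager() as mgr:
        d = mgr.dict()
        lock = mgr.Lock()
        procs = [
            multiprocessing.Process(target=worker, args=(d, lock, n))
            for n in range(4)
        ]
        for p in procs:
            p.start()
        for p in procs:
            p.join()
        print(len(d["foo"]))  # 4, in whatever order the workers ran
```

    Without the lock the result can come up short, because a worker may reassign the key from a list it read before another worker's write.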

  • 2021-01-05 03:54

    The Manager().dict() is a DictProxy object:

    >>> mgr.dict()
    <DictProxy object, typeid 'dict' at 0x1007bab50>
    >>> type(mgr.dict())
    <class 'multiprocessing.managers.DictProxy'>
    

    DictProxy is a subclass of the BaseProxy type, which does not behave entirely like a normal dict: http://docs.python.org/library/multiprocessing.html?highlight=multiprocessing#multiprocessing.managers.BaseProxy

    So it seems you have to handle the object returned by mgr.dict() differently than you would a plain dict.
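
    One way to do that on Python 3.6 or later, where the documentation says shared objects can be nested, is to store a manager.list() inside the dict instead of a plain list. A rough sketch follows (append_via_nested_proxy is a made-up helper, and the membership check followed by the assignment is not atomic, so concurrent writers would still need a lock):

```python
import multiprocessing


def append_via_nested_proxy(manager, shared, key, item):
    """Append `item` through a ListProxy stored under `key` in `shared`.

    Assumes Python 3.6+, where a proxy stored inside another proxy comes
    back as a proxy rather than as a copied plain list.
    """
    if key not in shared:
        shared[key] = manager.list()  # store a proxy, not a plain list
    shared[key].append(item)          # forwarded to the manager process
    return shared[key][:]             # plain-list snapshot of the contents


if __name__ == "__main__":
    with multiprocessing.Manager() as mgr:
        d = mgr.dict()
        print(append_via_nested_proxy(mgr, d, "foo", {"bar": "baz"}))
        # [{'bar': 'baz'}]
```

    Here the append is a method call on a proxy, so it reaches the list held by the manager instead of a local copy.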

  • 2021-01-05 03:55

    items() returns a copy. Appending to a copy does not affect the original. Did you mean this?

    >>> d['foo'] = {'bar': 'baz'}
    >>> print(d.items())
    [('foo', {'bar': 'baz'})]
    