Using defaultdict with multiprocessing?

天命终不由人 2021-02-04 13:21

Just experimenting and learning. I know how to create a shared dictionary that can be accessed from multiple processes, but I'm not sure how to keep the dict synced. de

2 answers
  • 2021-02-04 13:56

    You can subclass BaseManager and register additional types for sharing. You need to provide a suitable proxy type in cases where the default AutoProxy-generated type does not work. For defaultdict, if you only need to access the attributes that are already present in dict, you can use DictProxy.

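    from collections import defaultdict
    from multiprocessing import Pool
    from multiprocessing.managers import BaseManager, DictProxy
    
    class MyManager(BaseManager):
        pass
    
    # Expose defaultdict through the manager, reusing the stock dict proxy.
    MyManager.register('defaultdict', defaultdict, DictProxy)
    
    def test(k, multi_dict):
        # Runs in a worker; the proxy forwards each operation to the manager process.
        # Note: += is a separate read and write, so updates to the same key can race.
        multi_dict[k] += 1
    
    if __name__ == '__main__':
        pool = Pool(processes=4)
        mgr = MyManager()
        mgr.start()
        # int is the default factory, so missing keys start at 0.
        multi_d = mgr.defaultdict(int)
        for k in 'mississippi':
            pool.apply_async(test, (k, multi_d))
        pool.close()
        pool.join()
        print(multi_d.items())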
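
    Because `+=` on a proxy is a read followed by a write, two workers that touch the same key at the same moment can lose an update. One way to guard against that, sketched here under assumptions, is to take a lock from the same manager. The names `LockingManager` and `locked_increment` are hypothetical, and subclassing `SyncManager` (rather than `BaseManager`) is a choice made here only so that `Lock` is already registered.

```python
from collections import defaultdict
from multiprocessing import Pool
from multiprocessing.managers import DictProxy, SyncManager


class LockingManager(SyncManager):
    """Hypothetical manager: SyncManager's stock types plus defaultdict."""


# Same registration as in the answer, reusing the stock dict proxy.
LockingManager.register('defaultdict', defaultdict, DictProxy)


def locked_increment(key, shared, lock):
    # Hold the lock across the read and the write so no update is lost.
    with lock:
        shared[key] += 1


if __name__ == '__main__':
    with LockingManager() as mgr:          # entering the block starts the server
        counts = mgr.defaultdict(int)
        lock = mgr.Lock()                  # lock proxy, safe to pass to workers
        with Pool(processes=4) as pool:
            results = [pool.apply_async(locked_increment, (k, counts, lock))
                       for k in 'mississippi']
            for r in results:
                r.get()                    # re-raise any exception from a worker
        print(sorted(counts.items()))
        # [('i', 4), ('m', 1), ('p', 2), ('s', 4)]
```

    Calling `get()` on each result also surfaces worker errors that `apply_async` would otherwise swallow silently.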
    
  • 2021-02-04 13:58

    Well, the Manager class seems to supply only a fixed number of predefined data structures which can be shared among processes, and defaultdict is not among them. If you really just need that one defaultdict, the easiest solution would be to implement the defaulting behavior on your own:

    def test(k, multi_dict):
        if k not in multi_dict:
            multi_dict[k] = 0
        multi_dict[k] += 1
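
    For completeness, here is a runnable sketch of this manual-defaulting approach using only the stock `Manager().dict()`. The helper name `count_key` and the added lock are assumptions of this sketch rather than part of the answer; the lock is there because the membership check and the update are separate proxy calls.

```python
from multiprocessing import Manager, Pool


def count_key(key, shared, lock):
    # Hypothetical helper: manual defaulting done under a lock, so the
    # membership check and the update behave as one step.
    with lock:
        if key not in shared:
            shared[key] = 0
        shared[key] += 1


if __name__ == '__main__':
    with Manager() as mgr:
        counts = mgr.dict()    # stock shared dict, no custom registration
        lock = mgr.Lock()
        with Pool(processes=4) as pool:
            # starmap blocks until every task has finished and re-raises errors.
            pool.starmap(count_key, [(k, counts, lock) for k in 'mississippi'])
        print(sorted(counts.items()))
```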
    