How to merge dictionaries of dictionaries?

渐次进展 2020-11-22 05:13

I need to merge multiple dictionaries. Here's what I have, for instance:

dict1 = {1:{"a":{A}}, 2:{"b":{B}}}

dict2 = {2:{"c":{C}}, 3:{"d":{D}}}


        
29 answers
  • 2020-11-22 05:52

    The code will depend on your rules for resolving merge conflicts, of course. Here's a version which can take an arbitrary number of arguments and merges them recursively to an arbitrary depth, without using any object mutation. It uses the following rules to resolve merge conflicts:

    • dictionaries take precedence over non-dict values ({"foo": {...}} takes precedence over {"foo": "bar"})
    • later arguments take precedence over earlier arguments (if you merge {"a": 1}, {"a": 2}, and {"a": 3} in order, the result will be {"a": 3})

    try:
        from collections.abc import Mapping  # Python 3.3+
    except ImportError:
        from collections import Mapping  # Python 2
    
    def merge_dicts(*dicts):
        """
        Return a new dictionary that is the result of merging the arguments together.
        In case of conflicts, later arguments take precedence over earlier arguments.
        """
        updated = {}
        # grab all keys
        keys = set()
        for d in dicts:
            keys = keys.union(set(d))

        for key in keys:
            values = [d[key] for d in dicts if key in d]
            # which ones are mapping types? (aka dict)
            maps = [value for value in values if isinstance(value, Mapping)]
            if maps:
                # if we have any mapping types, call recursively to merge them
                updated[key] = merge_dicts(*maps)
            else:
                # otherwise, just grab the last value we have, since later arguments
                # take precedence over earlier arguments
                updated[key] = values[-1]
        return updated
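
To see the two conflict rules in action, here is a small usage sketch. It restates the function above in compact form (same logic, Python 3 import only) so that the snippet runs on its own. The sample dictionaries `defaults`, `site`, and `user` are made up for illustration.

```python
from collections.abc import Mapping


def merge_dicts(*dicts):
    """Compact restatement of the function above, so this example runs alone."""
    merged = {}
    for key in set().union(*dicts):
        values = [d[key] for d in dicts if key in d]
        maps = [v for v in values if isinstance(v, Mapping)]
        # Mappings win over plain values; otherwise the last value wins.
        merged[key] = merge_dicts(*maps) if maps else values[-1]
    return merged


# Hypothetical sample data, not from the original question.
defaults = {"db": {"host": "localhost", "port": 5432}, "debug": False}
site = {"db": {"host": "db.internal"}, "debug": {"level": 1}}
user = {"db": {"port": 6543}, "debug": True}

result = merge_dicts(defaults, site, user)
print(result)
```

Under these rules `"debug"` ends up as `{"level": 1}`: the only dict among its three values wins even though a later argument supplies `True`. The nested `"db"` dicts are merged key by key, with later arguments overriding earlier ones, and none of the inputs is modified.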
    
  • 2020-11-22 05:53

    Here's an easy way to do it using generators:

    def mergedicts(dict1, dict2):
        for k in set(dict1.keys()).union(dict2.keys()):
            if k in dict1 and k in dict2:
                if isinstance(dict1[k], dict) and isinstance(dict2[k], dict):
                    yield (k, dict(mergedicts(dict1[k], dict2[k])))
                else:
                    # If one of the values is not a dict, you can't continue merging it.
                    # Value from second dict overrides one in first and we move on.
                    yield (k, dict2[k])
                    # Alternatively, replace this with exception raiser to alert you of value conflicts
            elif k in dict1:
                yield (k, dict1[k])
            else:
                yield (k, dict2[k])
    
    dict1 = {1:{"a":"A"},2:{"b":"B"}}
    dict2 = {2:{"c":"C"},3:{"d":"D"}}
    
    print(dict(mergedicts(dict1, dict2)))


    This prints (the order of keys inside the nested dicts may vary):

    {1: {'a': 'A'}, 2: {'c': 'C', 'b': 'B'}, 3: {'d': 'D'}}
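
The comment in the code above suggests raising an exception instead of silently overriding. One possible sketch of that variant follows. The name `mergedicts_strict`, the `path` parameter, and the choice of `ValueError` are assumptions made for illustration, not part of the original answer.

```python
def mergedicts_strict(dict1, dict2, path=()):
    """Yield merged (key, value) pairs; raise ValueError on conflicting leaves."""
    for k in dict1.keys() | dict2.keys():
        if k in dict1 and k in dict2:
            v1, v2 = dict1[k], dict2[k]
            if isinstance(v1, dict) and isinstance(v2, dict):
                # Both sides are dicts: recurse and remember where we are.
                yield k, dict(mergedicts_strict(v1, v2, path + (k,)))
            elif v1 == v2:
                # Identical values are not a conflict.
                yield k, v1
            else:
                where = ".".join(str(p) for p in path + (k,))
                raise ValueError(f"conflicting values at {where}: {v1!r} vs {v2!r}")
        elif k in dict1:
            yield k, dict1[k]
        else:
            yield k, dict2[k]


merged = dict(mergedicts_strict({1: {"a": "A"}, 2: {"b": "B"}},
                                {2: {"c": "C"}, 3: {"d": "D"}}))
print(merged)

try:
    dict(mergedicts_strict({"x": {"y": 1}}, {"x": {"y": 2}}))
except ValueError as exc:
    conflict_message = str(exc)
    print(conflict_message)
```

Because this is a generator, the exception only surfaces once the result is consumed, here by the `dict(...)` call.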
    
  • 2020-11-22 05:53

    You could try mergedeep.


    Installation

    $ pip3 install mergedeep
    

    Usage

    from mergedeep import merge
    
    a = {"keyA": 1}
    b = {"keyB": {"sub1": 10}}
    c = {"keyB": {"sub2": 20}}
    
    merge(a, b, c) 
    
    print(a)
    # {"keyA": 1, "keyB": {"sub1": 10, "sub2": 20}}
    

    For a full list of options, check out the docs!

  • 2020-11-22 05:54

    There's a slight problem with andrew cooke's answer: in some cases it modifies the second argument b when you modify the returned dict. Specifically, it's because of this line:

    if key in a:
        ...
    else:
        a[key] = b[key]
    

    If b[key] is a dict, it will simply be assigned to a, meaning any subsequent modifications to that dict will affect both a and b.

    a={}
    b={'1':{'2':'b'}}
    c={'1':{'3':'c'}}
    merge(merge(a,b), c) # {'1': {'3': 'c', '2': 'b'}}
    a # {'1': {'3': 'c', '2': 'b'}} (as expected)
    b # {'1': {'3': 'c', '2': 'b'}} <----
    c # {'1': {'3': 'c'}} (unmodified)
    

    To fix this, the line would have to be substituted with this:

    if isinstance(b[key], dict):
        a[key] = clone_dict(b[key])
    else:
        a[key] = b[key]
    

    Where clone_dict is:

    def clone_dict(obj):
        clone = {}
        for key, value in obj.items():
            if isinstance(value, dict):
                clone[key] = clone_dict(value)
            else:
                clone[key] = value
        return clone
    

    Even so, this doesn't account for lists, sets, and other containers, but I hope it illustrates the pitfalls of merging dicts.
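
If nested lists and sets matter, the standard library's `copy.deepcopy` is one way to get a fully independent copy. The sketch below first reproduces the aliasing problem with plain assignment and then shows the same steps with `deepcopy`. The variable names and the `"tags"` list are made up for illustration.

```python
import copy

b = {"1": {"2": "b", "tags": ["x"]}}

# Plain assignment shares the nested dict between the two containers.
shared = {}
shared["1"] = b["1"]
shared["1"]["3"] = "c"
aliased = "3" in b["1"]
print(aliased)  # True: b was modified through `shared`

# deepcopy gives an independent copy, including the nested list.
b = {"1": {"2": "b", "tags": ["x"]}}
independent = {}
independent["1"] = copy.deepcopy(b["1"])
independent["1"]["3"] = "c"
independent["1"]["tags"].append("y")
print(b)  # {'1': {'2': 'b', 'tags': ['x']}}
```

The trade-off is that `deepcopy` copies everything reachable from the value, which may be more than a merge needs for large structures.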

    And for completeness' sake, here is my version, which accepts multiple dicts:

    from functools import reduce

    def merge_dicts(*args):
        def clone_dict(obj):
            clone = {}
            for key, value in obj.items():
                if isinstance(value, dict):
                    clone[key] = clone_dict(value)
                else:
                    clone[key] = value
            return clone

        def merge(a, b, path=None):
            path = path or []
            for key in b:
                if key in a:
                    if isinstance(a[key], dict) and isinstance(b[key], dict):
                        merge(a[key], b[key], path + [str(key)])
                    elif a[key] == b[key]:
                        pass
                    else:
                        raise Exception("Conflict at '{path}'".format(path='.'.join(path + [str(key)])))
                else:
                    if isinstance(b[key], dict):
                        a[key] = clone_dict(b[key])
                    else:
                        a[key] = b[key]
            return a
        return reduce(merge, args, {})
    
  • 2020-11-22 05:56

    Short-n-sweet:

    from collections.abc import MutableMapping as Map

    def nested_update(d, v):
        """
        Nested update of dict-like 'd' with dict-like 'v'.
        """
        for key in v:
            if key in d and isinstance(d[key], Map) and isinstance(v[key], Map):
                nested_update(d[key], v[key])
            else:
                d[key] = v[key]
    

    This works like (and is built on) Python's dict.update method. It returns None (you can always add return d if you prefer) because it updates dict d in place. Values in v overwrite the values of any existing keys in d (it does not try to interpret the dict's contents).

    It will also work for other ("dict-like") mappings.
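
A short usage sketch, repeating the function so that the snippet runs on its own. The `config` and `overrides` data are made up, and `OrderedDict` stands in for "another dict-like mapping".

```python
from collections import OrderedDict
from collections.abc import MutableMapping


def nested_update(d, v):
    """Same logic as above, repeated so this example runs on its own."""
    for key in v:
        if key in d and isinstance(d[key], MutableMapping) and isinstance(v[key], MutableMapping):
            nested_update(d[key], v[key])
        else:
            d[key] = v[key]


# Hypothetical sample data.
config = OrderedDict(server={"host": "localhost", "port": 80}, debug=False)
overrides = {"server": {"port": 8080}, "debug": True}

returned = nested_update(config, overrides)
print(returned)      # None, like dict.update
print(dict(config))  # {'server': {'host': 'localhost', 'port': 8080}, 'debug': True}
```

Note that values copied in by `d[key] = v[key]` are shared with `v`, not duplicated, so later changes to a nested dict taken from `v` would show up in both.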

  • 2020-11-22 05:59

    Overview

    The following approach subdivides the problem of a deep merge of dicts into:

    1. A parameterized shallow merge function merge(f)(a,b) that uses a function f to merge two dicts a and b

    2. A recursive merger function f to be used together with merge


    Implementation

    A function for merging two (non-nested) dicts can be written in a lot of ways. I personally like

    def merge(f):
        def merge(a,b): 
            keys = a.keys() | b.keys()
            return {key:f(a.get(key), b.get(key)) for key in keys}
        return merge
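
To illustrate how the choice of f changes the result, here is a small sketch that reuses the same shallow merge with two different combining functions. The helper names `prefer_right` and `add_counts` and the sample data are assumptions for illustration. The inner function is renamed to `merge_with` only to avoid shadowing the outer name.

```python
def merge(f):
    """Build a shallow merge that combines values key by key using f."""
    def merge_with(a, b):
        keys = a.keys() | b.keys()
        return {key: f(a.get(key), b.get(key)) for key in keys}
    return merge_with


def prefer_right(x, y):
    # A missing key shows up as None, so fall back to the other side.
    return y if y is not None else x


def add_counts(x, y):
    # Treat a missing key as zero and add the two sides.
    return (x or 0) + (y or 0)


left = {"apples": 1, "pears": 2}
right = {"pears": 5, "plums": 3}

overridden = merge(prefer_right)(left, right)
summed = merge(add_counts)(left, right)
print(overridden)
print(summed)
```

One consequence of using `dict.get` here is that a key whose value is explicitly `None` cannot be told apart from a missing key.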
    

    A nice way of defining an appropriate recursive merger function f is to use multipledispatch, which lets you define functions that take different paths depending on the types of their arguments.

    from multipledispatch import dispatch
    
    # for anything that is not a pair of dicts, prefer b unless it is None
    @dispatch(object, object)
    def f(a, b):
        return b if b is not None else a

    # for two dicts, recurse
    @dispatch(dict, dict)
    def f(a, b):
        return merge(f)(a, b)
    

    Example

    To merge two nested dicts simply use merge(f) e.g.:

    dict1 = {1:{"a":"A"},2:{"b":"B"}}
    dict2 = {2:{"c":"C"},3:{"d":"D"}}
    merge(f)(dict1, dict2)
    #returns {1: {'a': 'A'}, 2: {'b': 'B', 'c': 'C'}, 3: {'d': 'D'}} 
    

    Notes:

    The advantages of this approach are:

    • The function is built from smaller functions that each do a single thing, which makes the code simpler to reason about and test

    • The behaviour is not hard-coded but can be changed and extended as needed which improves code reuse (see example below).


    Customization

    Some answers also considered dicts that contain lists, e.g. of other (potentially nested) dicts. In this case one might want to map over the lists and merge them by position. This can be done by adding another definition to the merger function f:

    import itertools

    @dispatch(list, list)
    def f(a, b):
        # call f (not merge(f)) so the None padding from zip_longest is handled too
        return [f(*arg) for arg in itertools.zip_longest(a, b)]
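
For readers who want to try the positional list merge without installing multipledispatch, the following sketch swaps the dispatch decorators for plain `isinstance` checks. It is a stand-in with the same merge rules, not the dispatch-based design described above, and the function name `deep_merge` and the sample data are made up.

```python
from itertools import zip_longest


def deep_merge(a, b):
    """Return a new structure: dicts merge by key, lists merge by position."""
    if isinstance(a, dict) and isinstance(b, dict):
        return {k: deep_merge(a.get(k), b.get(k)) for k in a.keys() | b.keys()}
    if isinstance(a, list) and isinstance(b, list):
        # zip_longest pads the shorter list with None, which the last rule handles.
        return [deep_merge(x, y) for x, y in zip_longest(a, b)]
    # Anything else: prefer b unless it is missing (None).
    return b if b is not None else a


# Hypothetical sample data.
left = {"users": [{"name": "ann"}, {"name": "bob"}]}
right = {"users": [{"age": 30}], "version": 2}

merged = deep_merge(left, right)
print(merged)
```

The first list elements are merged into one dict, while the second element of `left` has no counterpart in `right` and is carried over unchanged.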
    