Flatten nested dictionaries, compressing keys

遇见更好的自我 2020-11-22 01:16

Suppose you have a dictionary like:

{'a': 1,
 'c': {'a': 2,
       'b': {'x': 5,
             'y': 10}},
 'd': [1, 2, 3]}

Ho

28 Answers
  •  后悔当初
    2020-11-22 01:48

    Utilizing recursion, keeping it simple and human readable:

    def flatten_dict(dictionary, accumulator=None, parent_key=None, separator="."):
        if accumulator is None:
            accumulator = {}
    
        for k, v in dictionary.items():
            k = f"{parent_key}{separator}{k}" if parent_key else k
            if isinstance(v, dict):
                # Forward the separator so nested levels use it as well.
                flatten_dict(dictionary=v, accumulator=accumulator, parent_key=k, separator=separator)
                continue
    
            accumulator[k] = v
    
        return accumulator
    

    The call is simple:

    new_dict = flatten_dict(dictionary)
    

    or

    new_dict = flatten_dict(dictionary, separator="_")
    

    if we want to change the default separator.
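As a worked example, here is a minimal sketch that applies the function to the dictionary from the question. It repeats the function so that it runs on its own, and it forwards separator to the recursive call so that deeper levels use the chosen separator too. The names nested, flat_dot and flat_underscore are illustrative only.

```python
def flatten_dict(dictionary, accumulator=None, parent_key=None, separator="."):
    """Flatten nested dicts, joining nested keys with `separator`."""
    if accumulator is None:
        accumulator = {}
    for k, v in dictionary.items():
        k = f"{parent_key}{separator}{k}" if parent_key else k
        if isinstance(v, dict):
            # Forward the separator so deeper levels use it too.
            flatten_dict(v, accumulator, parent_key=k, separator=separator)
            continue
        accumulator[k] = v
    return accumulator


# The dictionary from the question (illustrative variable name).
nested = {"a": 1, "c": {"a": 2, "b": {"x": 5, "y": 10}}, "d": [1, 2, 3]}

flat_dot = flatten_dict(nested)
flat_underscore = flatten_dict(nested, separator="_")

print(flat_dot)
# {'a': 1, 'c.a': 2, 'c.b.x': 5, 'c.b.y': 10, 'd': [1, 2, 3]}
print(flat_underscore)
# {'a': 1, 'c_a': 2, 'c_b_x': 5, 'c_b_y': 10, 'd': [1, 2, 3]}
```

Note that the list under 'd' is kept as a single value; only dictionaries are descended into.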

    A little breakdown:

    On the first call, only the dictionary to flatten is passed. The accumulator parameter exists to support recursion, as we will see later, so on that first call we initialise accumulator to an empty dictionary that will collect all of the nested values from the original dictionary.

    if accumulator is None:
        accumulator = {}
    

    As we iterate over the dictionary's items, we construct a key for every value. The parent_key argument is None on the first call; for every nested dictionary it holds the key pointing to that dictionary, so we prepend it, followed by the separator.

    k = f"{parent_key}{separator}{k}" if parent_key else k
    

    If the value v that key k points to is a dictionary, the function calls itself, passing the nested dictionary, the accumulator (the same object is passed, not a copy, so every change lands in the same instance) and the key k, so that the concatenated key can be built. Notice the continue statement: it skips the next line, outside the if block, so that the nested dictionary itself doesn't end up in the accumulator under key k.

    if isinstance(v, dict):
        flatten_dict(dictionary=v, accumulator=accumulator, parent_key=k, separator=separator)
        continue
    

    So, what do we do in case the value v is not a dictionary? Just put it unchanged inside the accumulator.

    accumulator[k] = v
    

    Once we're done we just return the accumulator, leaving the original dictionary argument untouched.
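One caveat worth illustrating: the original dictionary is not modified by the call, but the values are not copied either. The sketch below (which repeats the function so it runs on its own; snapshot, unchanged and shares_list are illustrative names) suggests that a mutable value such as a list is shared between the input and the flattened result.

```python
import copy


def flatten_dict(dictionary, accumulator=None, parent_key=None, separator="."):
    if accumulator is None:
        accumulator = {}
    for k, v in dictionary.items():
        k = f"{parent_key}{separator}{k}" if parent_key else k
        if isinstance(v, dict):
            flatten_dict(v, accumulator, parent_key=k, separator=separator)
            continue
        accumulator[k] = v  # stores the same object, not a copy
    return accumulator


nested = {"a": 1, "c": {"b": {"x": 5}}, "d": [1, 2, 3]}
snapshot = copy.deepcopy(nested)

flat = flatten_dict(nested)

# Flattening itself leaves the input as it was.
unchanged = nested == snapshot
# The list under "d" is the very same object in both dictionaries.
shares_list = flat["d"] is nested["d"]

# Mutating the shared list through the result is visible in the input.
flat["d"].append(4)
print(nested["d"])  # [1, 2, 3, 4]
```

If independent values are needed, one option is to flatten a copy.deepcopy of the input instead.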

    NOTE

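    This works as intended only for dictionaries whose keys are strings. Other hashable keys do not raise an error, because the f-string formats them through __format__ (which by default falls back to __str__ and, in turn, __repr__), but the results are usually unwanted: distinct nested keys such as 1 and "1" produce the same flattened key, and a falsy parent key such as 0 or "" is dropped from the prefix.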
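To make the note concrete, here is a small sketch of what happens with non-string keys. The function is repeated so the example runs on its own; the sample dictionaries and the names collided and dropped_prefix are made up for illustration.

```python
def flatten_dict(dictionary, accumulator=None, parent_key=None, separator="."):
    if accumulator is None:
        accumulator = {}
    for k, v in dictionary.items():
        k = f"{parent_key}{separator}{k}" if parent_key else k
        if isinstance(v, dict):
            flatten_dict(v, accumulator, parent_key=k, separator=separator)
            continue
        accumulator[k] = v
    return accumulator


# The int key 1 and the str key "1" both format to "1",
# so they collide and the later entry overwrites the earlier one.
collided = flatten_dict({"a": {1: "int key", "1": "str key"}})
print(collided)  # {'a.1': 'str key'}

# A falsy parent key (here 0) fails the `if parent_key` test,
# so the prefix is silently dropped.
dropped_prefix = flatten_dict({0: {"x": 1}})
print(dropped_prefix)  # {'x': 1}
```

Both cases lose information without raising an error, which is why string keys are the safe input for this function.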
