Flatten nested dictionaries, compressing keys

遇见更好的自我 2020-11-22 01:16

Suppose you have a dictionary like:

{'a': 1,
 'c': {'a': 2,
       'b': {'x': 5,
             'y': 10}},
 'd': [1, 2, 3]}

Ho

28 Answers
  • 2020-11-22 01:49

    I tried some of the solutions on this page, though not all, and the ones I tried failed to handle lists of dicts nested inside the dictionary.

    Consider a dict like this:

    d = {
            'owner': {
                'name': {'first_name': 'Steven', 'last_name': 'Smith'},
                'lottery_nums': [1, 2, 3, 'four', '11', None],
                'address': {},
                'tuple': (1, 2, 'three'),
                'tuple_with_dict': (1, 2, 'three', {'is_valid': False}),
                'set': {1, 2, 3, 4, 'five'},
                'children': [
                    {'name': {'first_name': 'Jessica',
                              'last_name': 'Smith', },
                     'children': []
                     },
                    {'name': {'first_name': 'George',
                              'last_name': 'Smith'},
                     'children': []
                     }
                ]
            }
        }
    

    Here's my makeshift solution:

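    def flatten_dict(input_node, key_: str = '', output_dict: dict = None):
        # A mutable default ({}) would be shared between calls, so results
        # from earlier calls would leak into later ones; create it per call.
        if output_dict is None:
            output_dict = {}
        if isinstance(input_node, dict):
            for key, val in input_node.items():
                new_key = f"{key_}.{key}" if key_ else f"{key}"
                flatten_dict(val, new_key, output_dict)
        elif isinstance(input_node, list):
            for idx, item in enumerate(input_node):
                new_key = f"{key_}.{idx}" if key_ else f"{idx}"
                flatten_dict(item, new_key, output_dict)
        else:
            # Leaf value (including tuples and sets): store it under the accumulated key.
            output_dict[key_] = input_node
        return output_dict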
    

    which produces:

    {
      owner.name.first_name: Steven,
      owner.name.last_name: Smith,
      owner.lottery_nums.0: 1,
      owner.lottery_nums.1: 2,
      owner.lottery_nums.2: 3,
      owner.lottery_nums.3: four,
      owner.lottery_nums.4: 11,
      owner.lottery_nums.5: None,
      owner.tuple: (1, 2, 'three'),
      owner.tuple_with_dict: (1, 2, 'three', {'is_valid': False}),
      owner.set: {1, 2, 3, 4, 'five'},
      owner.children.0.name.first_name: Jessica,
      owner.children.0.name.last_name: Smith,
      owner.children.1.name.first_name: George,
      owner.children.1.name.last_name: Smith,
    }
    

    This is a makeshift solution, and it's not perfect.
    NOTE:

    • It drops empty dicts, such as the address: {} key/value pair (empty lists are dropped too).

    • It won't flatten dicts nested in tuples, though that would be easy to add, since Python tuples behave much like lists.
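
Both limitations can be addressed with small changes. The following is a minimal sketch, not the answer's own code: the name flatten_nested and its keep_empty parameter are made up for illustration. It treats tuples like lists and, when keep_empty is true, stores empty containers as leaf values instead of dropping them.

```python
def flatten_nested(node, prefix='', out=None, keep_empty=True):
    """Flatten dicts, lists and tuples into one dict with dotted keys."""
    if out is None:
        out = {}  # fresh accumulator for each top-level call
    if isinstance(node, dict):
        children = node.items()
    elif isinstance(node, (list, tuple)):
        children = enumerate(node)  # tuples are indexed exactly like lists
    else:
        out[prefix] = node  # scalar (or set): store as a leaf
        return out
    if not node:
        # Empty dict/list/tuple: keep it as a leaf if requested.
        if keep_empty and prefix:
            out[prefix] = node
        return out
    for key, val in children:
        new_key = f"{prefix}.{key}" if prefix else str(key)
        flatten_nested(val, new_key, out, keep_empty)
    return out


example = {
    'owner': {
        'address': {},
        'tuple_with_dict': (1, 2, 'three', {'is_valid': False}),
        'children': [{'name': 'Jessica', 'children': []}],
    }
}
flat = flatten_nested(example)
print(flat['owner.tuple_with_dict.3.is_valid'])
```

With keep_empty=False the sketch falls back to dropping empty containers, which matches the behaviour of the function above.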

  • 2020-11-22 01:50

    Or, if you are already using pandas, you can do it with json_normalize() like so:

    import pandas as pd
    
    d = {'a': 1,
         'c': {'a': 2, 'b': {'x': 5, 'y' : 10}},
         'd': [1, 2, 3]}
    
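    # pd.json_normalize is the top-level function since pandas 1.0;
    # the older pd.io.json.json_normalize path is deprecated.
    df = pd.json_normalize(d, sep='_')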
    
    print(df.to_dict(orient='records')[0])
    

    Output:

    {'a': 1, 'c_a': 2, 'c_b_x': 5, 'c_b_y': 10, 'd': [1, 2, 3]}
    
  • 2020-11-22 01:51

    Not exactly what the OP asked, but lots of folks come here looking for ways to flatten real-world nested JSON data, which can contain nested JSON objects, arrays, JSON objects inside arrays, and so on. JSON doesn't include tuples, so we don't have to worry about those.

    I found an implementation of the list handling suggested in @roneo's comment on the answer posted by @Imran:

    https://github.com/ScriptSmith/socialreaper/blob/master/socialreaper/tools.py#L8

    from collections.abc import MutableMapping  # collections.MutableMapping was removed in Python 3.10

    def flatten(dictionary, parent_key=False, separator='.'):
        """
        Turn a nested dictionary into a flattened dictionary
        :param dictionary: The dictionary to flatten
        :param parent_key: The string to prepend to dictionary's keys
        :param separator: The string used to separate flattened keys
        :return: A flattened dictionary
        """
    
        items = []
        for key, value in dictionary.items():
            new_key = str(parent_key) + separator + key if parent_key else key
            if isinstance(value, MutableMapping):
                items.extend(flatten(value, new_key, separator).items())
            elif isinstance(value, list):
                for k, v in enumerate(value):
                    # Pass the separator along so list indices use it too.
                    items.extend(flatten({str(k): v}, new_key, separator).items())
            else:
                items.append((new_key, value))
        return dict(items)
    

    Test it:

    flatten({'a': 1, 'c': {'a': 2, 'b': {'x': 5, 'y' : 10}}, 'd': [1, 2, 3] })
    
    >> {'a': 1, 'c.a': 2, 'c.b.x': 5, 'c.b.y': 10, 'd.0': 1, 'd.1': 2, 'd.2': 3}
    

    And that does the job I need: I can throw any complicated JSON at it and it flattens it out for me.

    All credit to https://github.com/ScriptSmith.
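
To illustrate the "JSON objects inside arrays" case, here is a self-contained sketch that parses a small JSON document and flattens it. The helper flatten_json and the sample payload are made up for this example; it follows the same idea as the function above but is not the socialreaper code.

```python
import json
from collections.abc import Mapping


def flatten_json(value, parent_key='', separator='.'):
    """Flatten parsed JSON (dicts, lists, scalars) into one flat dict."""
    if isinstance(value, Mapping):
        pairs = value.items()
    elif isinstance(value, list):
        pairs = enumerate(value)  # list indices become key segments
    else:
        return {parent_key: value}  # scalar leaf
    flat = {}
    for key, child in pairs:
        new_key = f"{parent_key}{separator}{key}" if parent_key else str(key)
        flat.update(flatten_json(child, new_key, separator))
    return flat


payload = json.loads('''
{
  "user": {"id": 7, "tags": ["a", "b"]},
  "posts": [
    {"title": "first", "likes": 3},
    {"title": "second", "likes": null}
  ]
}
''')
flat = flatten_json(payload)
for key in sorted(flat):
    print(key, '=', flat[key])
```

Like the function above, this sketch silently drops empty objects and arrays, because they contribute no leaf values.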

  • 2020-11-22 01:52

    I always prefer to access dict objects via .items(), so for flattening dicts I use the following recursive generator, flat_items(d). If you want a dict again, simply wrap it like this: flat = dict(flat_items(d))

    def flat_items(d, key_separator='.'):
        """
        Flattens the dictionary containing other dictionaries like here: https://stackoverflow.com/questions/6027558/flatten-nested-python-dictionaries-compressing-keys
    
        >>> example = {'a': 1, 'c': {'a': 2, 'b': {'x': 5, 'y' : 10}}, 'd': [1, 2, 3]}
        >>> flat = dict(flat_items(example, key_separator='_'))
        >>> assert flat['c_b_y'] == 10
        """
        for k, v in d.items():
            if type(v) is dict:
                for k1, v1 in flat_items(v, key_separator=key_separator):
                    yield key_separator.join((k, k1)), v1
            else:
                yield k, v
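
Two details of the generator above are worth noting: type(v) is dict skips dict subclasses such as OrderedDict, and str.join raises TypeError when a key is not a string. A hedged variant (the name flat_items_any is hypothetical) that descends into any Mapping and stringifies keys might look like this:

```python
from collections import OrderedDict
from collections.abc import Mapping


def flat_items_any(d, key_separator='.'):
    """Yield (flat_key, value) pairs, descending into any Mapping."""
    for k, v in d.items():
        if isinstance(v, Mapping):
            for k1, v1 in flat_items_any(v, key_separator=key_separator):
                # f-string formatting stringifies non-string keys
                yield f"{k}{key_separator}{k1}", v1
        else:
            yield str(k), v


example = {
    'a': 1,
    'c': OrderedDict(a=2, b={'x': 5, 'y': 10}),
    1: {2: 'two'},
    'd': [1, 2, 3],
}
flat = dict(flat_items_any(example, key_separator='_'))
print(flat)
```

As in the original generator, lists are left untouched and empty nested mappings yield nothing.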
    