Flatten nested dictionaries, compressing keys

遇见更好的自我 2020-11-22 01:16

Suppose you have a dictionary like:

{'a': 1,
 'c': {'a': 2,
       'b': {'x': 5,
             'y' : 10}},
 'd': [1, 2, 3]}

Ho

28 answers
  • 2020-11-22 01:42

    Here's an algorithm that flattens the dictionary in place, replacing nested keys with joined ones. Tested with Python 2.7 and Python 3.5. It uses the dot character as a separator.

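    def flatten_json(data):
        # `data` replaces the original parameter name `json`, which shadowed the standard module.
        if isinstance(data, dict):
            # Iterate over a snapshot, because the dict is modified inside the loop.
            for k, v in list(data.items()):
                if isinstance(v, dict):
                    flatten_json(v)  # flatten the child first
                    data.pop(k)
                    for k2, v2 in v.items():
                        data[k + "." + k2] = v2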
    

    Example:

    d = {'a': {'b': 'c'}}                   
    flatten_json(d)
    print(d)
    unflatten_json(d)
    print(d)
    

    Output:

    {'a.b': 'c'}
    {'a': {'b': 'c'}}
    

    I published this code here along with the matching unflatten_json function.
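
The `unflatten_json` function used in the example is not reproduced in this answer. The following is a minimal sketch of what such an inverse could look like. It is an assumption written for illustration, not the author's published code. The `sep` parameter is hypothetical, and the sketch assumes that string keys only contain the separator where nesting is intended and that no key is both a plain value and a prefix of another key.

```python
def unflatten_json(data, sep="."):
    """Rebuild nested dicts in place from keys joined with `sep` (illustrative sketch)."""
    if not isinstance(data, dict):
        return
    # Iterate over a snapshot of the keys, because the dict is modified in the loop.
    for key in list(data.keys()):
        if not isinstance(key, str) or sep not in key:
            continue
        value = data.pop(key)
        *parents, leaf = key.split(sep)
        node = data
        for part in parents:
            # Create intermediate dictionaries on demand.
            node = node.setdefault(part, {})
        node[leaf] = value


d = {"a.b": "c", "a.d.e": 1, "f": 2}
unflatten_json(d)
print(d)
```

Like the flattening function, this modifies its argument and returns nothing, so the round trip in the example above works on the same object.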

  • 2020-11-22 01:42

    If you want to flatten a nested dictionary and get a list of all its unique keys, here is a solution:

    def flat_dict_return_unique_key(data, unique_keys=None):
        # Default to None: a mutable default set would be shared across calls.
        if unique_keys is None:
            unique_keys = set()
        if isinstance(data, dict):
            unique_keys.update(data.keys())
            for each_v in data.values():
                if isinstance(each_v, dict):
                    flat_dict_return_unique_key(each_v, unique_keys)
        return list(unique_keys)
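
To make the behaviour concrete, here is a small check on the dictionary from the question. The helper is a self-contained variant written for this illustration (the name `collect_unique_keys` is hypothetical). It creates a fresh set on every call, so keys from an earlier call cannot leak into a later one.

```python
def collect_unique_keys(data, unique_keys=None):
    """Return the set of keys found at any nesting level of `data`."""
    if unique_keys is None:
        unique_keys = set()
    if isinstance(data, dict):
        for key, value in data.items():
            unique_keys.add(key)
            # Recurse into the value; non-dict values are ignored by the check above.
            collect_unique_keys(value, unique_keys)
    return unique_keys


nested = {'a': 1, 'c': {'a': 2, 'b': {'x': 5, 'y': 10}}, 'd': [1, 2, 3]}
print(sorted(collect_unique_keys(nested)))  # ['a', 'b', 'c', 'd', 'x', 'y']
print(collect_unique_keys({'z': 1}))        # {'z'}
```

Note that this only collects key names. It does not record which path each key came from, so `'a'` at the top level and `'a'` under `'c'` count once.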
    
  • 2020-11-22 01:44

    My Python 3.3 solution using generators:

    def flattenit(pyobj, keystring=''):
        if type(pyobj) is dict:
            keystring = keystring + "_" if keystring else keystring
            for k in pyobj:
                yield from flattenit(pyobj[k], keystring + k)
        else:
            # Non-dict values, including lists, are yielded unchanged.
            yield keystring, pyobj
    
    my_obj = {'a': 1, 'c': {'a': 2, 'b': {'x': 5, 'y': 10}}, 'd': [1, 2, 3]}
    
    # your flattened dictionary object
    flattened = {k: v for k, v in flattenit(my_obj)}
    print(flattened)
    
    # result: {'c_b_y': 10, 'd': [1, 2, 3], 'c_a': 2, 'a': 1, 'c_b_x': 5}
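
The generator above leaves lists untouched, as the result shows for `'d'`. If list elements should be flattened too, one possible extension is sketched below. The function name `flatten_with_lists`, the `sep` parameter, and the choice of the list index as the key segment are assumptions made for this illustration. It uses f-strings, so it assumes Python 3.6 or later rather than 3.3.

```python
def flatten_with_lists(obj, prefix='', sep='_'):
    """Yield (flat_key, value) pairs, descending into dicts and lists."""
    if isinstance(obj, dict):
        for key, value in obj.items():
            child = f"{prefix}{sep}{key}" if prefix else str(key)
            yield from flatten_with_lists(value, child, sep)
    elif isinstance(obj, list):
        # List positions become key segments: 'd' -> 'd_0', 'd_1', ...
        for index, item in enumerate(obj):
            child = f"{prefix}{sep}{index}" if prefix else str(index)
            yield from flatten_with_lists(item, child, sep)
    else:
        yield prefix, obj


my_obj = {'a': 1, 'c': {'a': 2, 'b': {'x': 5, 'y': 10}}, 'd': [1, 2, 3]}
print(dict(flatten_with_lists(my_obj)))
# {'a': 1, 'c_a': 2, 'c_b_x': 5, 'c_b_y': 10, 'd_0': 1, 'd_1': 2, 'd_2': 3}
```

One caveat of this sketch: empty dictionaries and empty lists yield nothing, so they disappear from the flattened result.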
    
  • 2020-11-22 01:45

    I actually wrote a package called cherrypicker recently to deal with this exact sort of thing since I had to do it so often!

    I think the following code would give you exactly what you're after:

    from cherrypicker import CherryPicker
    
    dct = {
        'a': 1,
        'c': {
            'a': 2,
            'b': {
                'x': 5,
                'y' : 10
            }
        },
        'd': [1, 2, 3]
    }
    
    picker = CherryPicker(dct)
    picker.flatten().get()
    

    You can install the package with:

    pip install cherrypicker
    

    ...and there is more documentation and guidance at https://cherrypicker.readthedocs.io.

    Other methods may be faster, but the priority of this package is to make such tasks easy. If you do have a large list of objects to flatten though, you can also tell CherryPicker to use parallel processing to speed things up.

  • 2020-11-22 01:45

    If you do not mind recursive functions, here is a solution. I have also included an exclusion parameter, in case there are one or more keys whose values you want to keep nested.

    Code:

    def flatten_dict(dictionary, exclude=(), delimiter='_'):
        # An immutable default avoids the shared mutable default argument pitfall.
        flat_dict = {}
        for key, value in dictionary.items():
            if isinstance(value, dict) and key not in exclude:
                flatten_value_dict = flatten_dict(value, exclude, delimiter)
                for k, v in flatten_value_dict.items():
                    flat_dict[f"{key}{delimiter}{k}"] = v
            else:
                flat_dict[key] = value
        return flat_dict
    

    Usage:

    d = {'a':1, 'b':[1, 2], 'c':3, 'd':{'a':4, 'b':{'a':7, 'b':8}, 'c':6}, 'e':{'a':1,'b':2}}
    flat_d = flatten_dict(dictionary=d, exclude=['e'], delimiter='.')
    print(flat_d)
    

    Output:

    {'a': 1, 'b': [1, 2], 'c': 3, 'd.a': 4, 'd.b.a': 7, 'd.b.b': 8, 'd.c': 6, 'e': {'a': 1, 'b': 2}}
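
For readers who do mind recursion, for example because the input may be nested more deeply than Python's recursion limit allows, the same idea can be written with an explicit stack. This is a sketch under the same assumptions as the answer above. The name `flatten_dict_iterative` is hypothetical, and unlike the recursive version it converts top-level keys to strings.

```python
def flatten_dict_iterative(dictionary, exclude=(), delimiter='_'):
    """Flatten nested dicts with an explicit stack instead of recursion."""
    flat_dict = {}
    # Each stack entry is (key prefix, dictionary still to be visited).
    stack = [('', dictionary)]
    while stack:
        prefix, current = stack.pop()
        for key, value in current.items():
            full_key = f"{prefix}{delimiter}{key}" if prefix else str(key)
            if isinstance(value, dict) and key not in exclude:
                # Defer the nested dictionary; its keys get `full_key` as prefix.
                stack.append((full_key, value))
            else:
                flat_dict[full_key] = value
    return flat_dict


d = {'a': 1, 'b': [1, 2], 'c': 3, 'd': {'a': 4, 'b': {'a': 7, 'b': 8}, 'c': 6}, 'e': {'a': 1, 'b': 2}}
print(flatten_dict_iterative(d, exclude=['e'], delimiter='.'))
```

The result contains the same key/value pairs as the recursive version, although the insertion order of the keys can differ because nested dictionaries are processed after their siblings.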
    
  • 2020-11-22 01:46

    This is not restricted to dictionaries; it works with every mapping type that implements .items(). It should also be faster, since it avoids an if condition. Nevertheless, credit goes to Imran:

    def flatten(d, parent_key=''):
        items = []
        for k, v in d.items():
            try:
                items.extend(flatten(v, '%s%s_' % (parent_key, k)).items())
            except AttributeError:
                items.append(('%s%s' % (parent_key, k), v))
        return dict(items)
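
As a quick illustration of the claim about mapping types, the sketch below feeds the function a mix of `OrderedDict` and `MappingProxyType` objects from the standard library. The function body is repeated only so the example runs on its own, and the `config` data is made up for the demonstration.

```python
from collections import OrderedDict
from types import MappingProxyType


def flatten(d, parent_key=''):
    # Same approach as the answer above, repeated so the example is self-contained.
    items = []
    for k, v in d.items():
        try:
            # Anything exposing .items() is treated as a nested mapping.
            items.extend(flatten(v, '%s%s_' % (parent_key, k)).items())
        except AttributeError:
            # Leaves (ints, lists, strings, ...) have no .items().
            items.append(('%s%s' % (parent_key, k), v))
    return dict(items)


config = OrderedDict(
    a=1,
    c=MappingProxyType({'a': 2, 'b': OrderedDict(x=5, y=10)}),
    d=[1, 2, 3],
)
print(flatten(config))
# {'a': 1, 'c_a': 2, 'c_b_x': 5, 'c_b_y': 10, 'd': [1, 2, 3]}
```

The flip side of this duck typing is that any value with an `.items()` method is descended into, whether or not it was meant to be treated as a nested mapping.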
    