Unfold a nested dictionary with lists into a pandas DataFrame

陌清茗 2021-01-18 18:27

I have a nested dictionary whose sub-dictionaries contain lists:

nested_dict = {'string1': {69: [1231, 232], 67: [682, 12], 65: [1, 1]},
    'string2': {2


        
2 Answers
  •  天涯浪人
    2021-01-18 19:08

    Here's a method that uses a recursive generator to unroll the nested dictionaries. It doesn't assume that you have exactly two levels; it keeps unrolling each dict until it hits a list.

    import pandas as pd

    nested_dict = {
        'string1': {69: [1231, 232], 67: [682, 12], 65: [1, 1]},
        'string2': {28672: [82, 23], 22736: [82, 93, 1102, 102], 19423: [64, 23]},
        'string3': [101, 102]}

    def unroll(data):
        if isinstance(data, dict):
            for key, value in data.items():
                # Recursively unroll the next level and prepend the key to each row.
                for row in unroll(value):
                    yield [key] + row
        elif isinstance(data, list):
            # This is the bottom of the structure (defines exactly one row).
            yield data

    # Rows of different lengths are padded with NaN by the DataFrame constructor.
    df = pd.DataFrame(list(unroll(nested_dict)))
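
    To illustrate the point about not assuming exactly two levels, here is a minimal sketch that runs the same generator on a three-level input. The dictionary `deep` is a made-up example, not data from the question, and the function is repeated only so the snippet runs on its own.

```python
def unroll(data):
    """Yield one flat row for every list found at the bottom of nested dicts."""
    if isinstance(data, dict):
        for key, value in data.items():
            # Prepend this level's key to every row produced further down.
            for row in unroll(value):
                yield [key] + row
    elif isinstance(data, list):
        # A list ends the recursion and defines exactly one row.
        yield data

# Hypothetical input with branches of depth 3, 2 and 1 (an assumption for illustration).
deep = {'a': {'x': {1: [10, 20]}, 'y': [30]}, 'b': [40, 50, 60]}

rows = list(unroll(deep))
for row in rows:
    print(row)
# ['a', 'x', 1, 10, 20]
# ['a', 'y', 30]
# ['b', 40, 50, 60]
```

    Shallower branches give shorter rows, so a key from one branch and a value from another can end up in the same column position. That is worth keeping in mind when the nesting depth is not uniform.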
    

    Because unroll produces a list of lists rather than dicts, the columns are named numerically (from 0 to 5 in this case), so you need to use rename to get the column labels you want:

    df.rename(columns=lambda i: 'col{}'.format(i+1))
    

    This returns the following result (note that the additional string3 entry is also unrolled).

          col1   col2  col3   col4    col5   col6
    0  string1     69  1231  232.0     NaN    NaN
    1  string1     67   682   12.0     NaN    NaN
    2  string1     65     1    1.0     NaN    NaN
    3  string2  28672    82   23.0     NaN    NaN
    4  string2  22736    82   93.0  1102.0  102.0
    5  string2  19423    64   23.0     NaN    NaN
    6  string3    101   102    NaN     NaN    NaN
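
    One detail about the rename step: `DataFrame.rename` returns a new DataFrame and leaves the original untouched, so the labels are only kept if the result is assigned. A minimal sketch, using two hand-written rows in the same shape that unroll produces (the shortened `rows` list is an assumption to keep the example small):

```python
import pandas as pd

# Two ragged rows, shaped like the output of unroll().
rows = [['string1', 69, 1231, 232], ['string3', 101, 102]]
df = pd.DataFrame(rows)

# rename() returns a new DataFrame, so assign the result to keep the labels.
df = df.rename(columns=lambda i: 'col{}'.format(i + 1))

print(list(df.columns))
# ['col1', 'col2', 'col3', 'col4']
```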
    
