pandas groupby to nested json

再見小時候 2020-11-27 05:47

I often use pandas groupby to generate stacked tables. But then I often want to output the resulting nested relations to json. Is there any way to extract a nested json fil

4 answers
  • 2020-11-27 06:23

    I had a look at the defaultdict-based solution in this thread and found that it only works for 3 levels of nesting. This solution works for any number of levels.

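    import json
    
    levels = len(grouped.index.levels)
    # dicts[0] holds the innermost level being built, dicts[-1] the outermost.
    dicts = [{} for _ in range(levels)]
    last_index = None
    
    for index, value in grouped.itertuples():
    
        if last_index is None:
            last_index = index
    
        # Find the outermost level whose key changed and start fresh dicts
        # for every level nested below it.
        for ii, (i, j) in enumerate(zip(index, last_index)):
            if i != j:
                ii = levels - ii - 1
                dicts[:ii] = [{} for _ in dicts[:ii]]
                break
    
        # Store the value under the innermost key, then link each level into its parent.
        for i, key in enumerate(reversed(index)):
            dicts[i][key] = value
            value = dicts[i]
    
        last_index = index
    
    
    result = json.dumps(dicts[-1])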
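
The loop above relies on rows arriving in sorted index order, which a default groupby(...).sum() provides. Where that ordering is not guaranteed, a minimal order-independent sketch is to walk each index tuple with `dict.setdefault`. The helper name `nest_rows` is hypothetical, and the sample rows reuse the values from the output shown elsewhere in this thread; with pandas the pairs would come from `grouped.itertuples()`.

```python
import json


def nest_rows(rows):
    """Nest (index_tuple, value) pairs into a dict of dicts of any depth."""
    result = {}
    for index, value in rows:
        node = result
        # Descend through every key but the last, creating levels on demand.
        for key in index[:-1]:
            node = node.setdefault(key, {})
        node[index[-1]] = value
    return result


# Illustrative rows, deliberately out of order.
rows = [
    ((2010, "mayor", "joe smith"), 100.0),
    ((2010, "govnr", "pati mara"), 500.0),
    ((2010, "mayor", "jay gould"), 12.0),
    ((2010, "govnr", "jess rapp"), 80.0),
]
nested = nest_rows(rows)
print(json.dumps(nested, indent=4))
```

Because every row looks its path up from the root, no state is carried between rows, at the cost of one dictionary lookup per level per row.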
    
  • 2020-11-27 06:29

    I'm aware this is an old question, but I came across the same issue recently. Here's my solution. I borrowed a lot of stuff from chrisb's example (Thank you!).

    This has the advantage that you can pass a lambda to get the final value from whatever enumerable you want, as well as for each group.

    from collections import defaultdict
    
    def dict_from_enumerable(enumerable, final_value, *groups):
        d = defaultdict(lambda: defaultdict(dict))
        group_count = len(groups)
        for item in enumerable:
            nested = d
            item_result = final_value(item) if callable(final_value) else item.get(final_value)
            for i, group in enumerate(groups, start=1):
                group_val = str(group(item) if callable(group) else item.get(group))
                if i == group_count:
                    nested[group_val] = item_result
                else:
                    nested = nested[group_val]
        return d
    

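    The function reads fields with item.get(...), so each item has to be dict-like. For the data in the question, convert the grouped frame to records first and then call:

    dict_from_enumerable(grouped.reset_index().to_dict(orient='records'), 'amount', 'year', 'office', 'candidate')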
    

    The first argument can also be any iterable of dicts, so pandas is not required.
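
The function above nests only as deep as its `defaultdict(lambda: defaultdict(dict))` allows, which covers up to three groups. Below is a hedged sketch of a variant for any number of groups, run on a plain list of dicts with a lambda as one of the groups. The names `nest_records`, `pick`, and `tree` are hypothetical and not part of the answer's code; the records reuse values from this thread.

```python
from collections import defaultdict


def nest_records(records, final_value, *groups):
    """Nest dict-like records by one or more groups (keys or callables)."""
    def tree():
        # Each missing key creates another tree, so depth is unbounded.
        return defaultdict(tree)

    def pick(item, selector):
        # A selector is either a callable or a key to look up on the record.
        return selector(item) if callable(selector) else item.get(selector)

    root = tree()
    for item in records:
        node = root
        # Keys are stringified, as in the function above.
        *parents, leaf = [str(pick(item, g)) for g in groups]
        for key in parents:
            node = node[key]
        node[leaf] = pick(item, final_value)
    return root


records = [
    {"year": 2010, "office": "mayor", "candidate": "joe smith", "amount": 100.0},
    {"year": 2010, "office": "govnr", "candidate": "pati mara", "amount": 500.0},
]
by_decade = nest_records(
    records,
    lambda r: round(r["amount"]),    # final value computed by a callable
    lambda r: r["year"] // 10 * 10,  # derived group: the decade
    "year", "office", "candidate",
)
```

The result is a defaultdict tree, which `json.dumps` serializes like an ordinary dict.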

  • 2020-11-27 06:37

    Here is a generic recursive solution for this problem:

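    def df_to_dict(df):
        # Base case: a Series (for example a single row) maps directly to a dict.
        if df.ndim == 1:
            return df.to_dict()
    
        ret = {}
        # Visit each distinct key of the outermost index level once.
        for key in df.index.get_level_values(0).unique():
            sub_df = df.xs(key)
            ret[key] = df_to_dict(sub_df)
        return ret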
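
For reference, the same recursion can be sketched without pandas on plain (key_tuple, value) pairs: group by the first key, strip it, and recurse, which is roughly the role df.xs(key) plays per level. `rows_to_dict` is a hypothetical name, and the input is assumed to be sorted by key so that `itertools.groupby` sees equal keys next to each other.

```python
from itertools import groupby


def rows_to_dict(rows):
    """Recursively nest (key_tuple, value) pairs that are sorted by key_tuple."""
    rows = list(rows)
    # Base case: one key left per row, so map it straight to the value.
    if all(len(keys) == 1 for keys, _ in rows):
        return {keys[0]: value for keys, value in rows}

    ret = {}
    # groupby only merges adjacent rows, hence the sorted-input assumption.
    for key, group in groupby(rows, key=lambda row: row[0][0]):
        ret[key] = rows_to_dict([(keys[1:], value) for keys, value in group])
    return ret


rows = [
    ((2010, "govnr", "jess rapp"), 80.0),
    ((2010, "govnr", "pati mara"), 500.0),
    ((2010, "mayor", "jay gould"), 12.0),
    ((2010, "mayor", "joe smith"), 100.0),
]
nested = rows_to_dict(rows)
```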
    
  • 2020-11-27 06:44

    I don't think there is anything built into pandas to create a nested dictionary of the data. Below is some code that uses a defaultdict and iterates grouped.itertuples(), so grouped is assumed to be a single-column DataFrame with a MultiIndex. As written, defaultdict(lambda: defaultdict(dict)) supports at most three index levels.

    The nesting code iterates through each level of the MultiIndex, adding layers to the dictionary until the deepest layer is assigned the row's value.

    In  [99]: from collections import defaultdict
    
    In [100]: results = defaultdict(lambda: defaultdict(dict))
    
    In [101]: for index, value in grouped.itertuples():
         ...:     for i, key in enumerate(index):
         ...:         if i == 0:
         ...:             nested = results[key]
         ...:         elif i == len(index) - 1:
         ...:             nested[key] = value
         ...:         else:
         ...:             nested = nested[key]
    
    In [102]: results
    Out[102]: defaultdict(<function <lambda> at 0x7ff17c76d1b8>, {2010: defaultdict(<type 'dict'>, {'govnr': {'pati mara': 500.0, 'jess rapp': 80.0}, 'mayor': {'joe smith': 100.0, 'jay gould': 12.0}})})
    
    In [106]: print(json.dumps(results, indent=4))
    {
        "2010": {
            "govnr": {
                "pati mara": 500.0, 
                "jess rapp": 80.0
            }, 
            "mayor": {
                "joe smith": 100.0, 
                "jay gould": 12.0
            }
        }
    }
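
Note that the integer year 2010 appears as the string "2010" in the output: JSON object keys are always strings, and `json.dumps` converts int keys on the way out. A short sketch of that round trip, using a plain dict with values from the output above:

```python
import json

nested = {2010: {"mayor": {"joe smith": 100.0, "jay gould": 12.0}}}

text = json.dumps(nested, indent=4, sort_keys=True)
restored = json.loads(text)

# The int key does not survive the round trip; it comes back as a string.
print(list(nested))    # [2010]
print(list(restored))  # ['2010']
```

Code that reads the file back therefore has to look up "2010" or convert the keys itself.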
    