Convert dataframe to JSON with 2 level nested array

我在风中等你 2021-01-27 21:42

I am a bit new to Python programming. I have a small requirement where I need to list all customers and their amounts for a given fortnight in JSON format.

Cu

1 answer
  •  时光取名叫无心
    2021-01-27 22:15

    Start by grouping on both the Parameter and FortNight columns, and using .to_dict() on the resulting grouped rows to produce the inner-most dictionaries:

    details = df.groupby(['Parameter', 'FortNight']).apply(
        lambda r: r[['Customer', 'Amount']].to_dict(orient='records'))
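
    To try this step on its own, here is a minimal sketch. The sample df is an assumption: the question's data is not shown here, so the rows are rebuilt from the output printed later in this answer.

```python
import pandas as pd

# Assumed sample data, reconstructed from the output shown in this answer;
# it is not the question's original dataframe.
df = pd.DataFrame({
    'Parameter': ['CustomerSales'] * 5,
    'FortNight': ['Apr-2FN-2018'] * 5,
    'Customer': ['10992', '10994', '10995', '11000', '10990'],
    'Amount': [339632.0, 27282.0, 26353.0, 24797.0, 21093.0],
})

# One list of {'Customer': ..., 'Amount': ...} dictionaries
# per (Parameter, FortNight) group.
details = df.groupby(['Parameter', 'FortNight']).apply(
    lambda r: r[['Customer', 'Amount']].to_dict(orient='records'))

first_group = details.iloc[0]  # the list of dictionaries for the only group
```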
    

    This gives you a series with a multi-index over Parameter and FortNight, whose values are lists in the required format, each entry a dictionary with Customer and Amount keys. If you need to convert the value types, do so on the r[['Customer', 'Amount']] dataframe result before calling to_dict() on it.
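
    As a sketch of such a conversion, assuming you want Amount as an integer (the choice of int and the two-row sample df are assumptions made for illustration), .astype() can be chained in before to_dict():

```python
import pandas as pd

# Assumed two-row sample, taken from the output shown in this answer.
df = pd.DataFrame({
    'Parameter': ['CustomerSales', 'CustomerSales'],
    'FortNight': ['Apr-2FN-2018', 'Apr-2FN-2018'],
    'Customer': ['10992', '10994'],
    'Amount': [339632.0, 27282.0],
})

# Convert the Amount column to int on the selected sub-frame,
# then build the per-group list of dictionaries as before.
details = df.groupby(['Parameter', 'FortNight']).apply(
    lambda r: r[['Customer', 'Amount']]
    .astype({'Amount': int})
    .to_dict(orient='records'))

first_entry = details.iloc[0][0]
```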

    You can then unstack the series into a dataframe, giving you a nested Parameter -> FortNight -> details structure; the Parameter values become columns, each list of Customer / Amount dictionaries indexed by FortNight:

    nested = details.unstack('Parameter')
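
    A quick way to check the shape of the unstacked frame, again with a small sample df rebuilt from the output in this answer (an assumption, not the question's real data):

```python
import pandas as pd

# Assumed two-row sample, taken from the output shown in this answer.
df = pd.DataFrame({
    'Parameter': ['CustomerSales', 'CustomerSales'],
    'FortNight': ['Apr-2FN-2018', 'Apr-2FN-2018'],
    'Customer': ['10992', '10994'],
    'Amount': [339632.0, 27282.0],
})

details = df.groupby(['Parameter', 'FortNight']).apply(
    lambda r: r[['Customer', 'Amount']].to_dict(orient='records'))

# Move the Parameter level of the index into the columns.
nested = details.unstack('Parameter')

columns = list(nested.columns)   # one column per Parameter value
index_name = nested.index.name   # the remaining index level
cell = nested.loc['Apr-2FN-2018', 'CustomerSales']  # list of dictionaries
```

    With several Parameter values, a FortNight that is missing for one of them shows up as NaN in that column rather than as a list.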
    

    If you turn this into a dictionary, you'd get a dictionary that's mostly correct already:

    >>> from pprint import pprint
    >>> pprint(nested.to_dict())
    {'CustomerSales': {'Apr-2FN-2018': [{'Amount': 339632.0, 'Customer': '10992'},
                                        {'Amount': 27282.0, 'Customer': '10994'},
                                        {'Amount': 26353.0, 'Customer': '10995'},
                                        {'Amount': 24797.0, 'Customer': '11000'},
                                        {'Amount': 21093.0, 'Customer': '10990'}]}}
    

    but for your format, you'd convert the values in each column to a list of {'FortNight': indexvalue, 'Details': value} mappings, then convert the whole structure to a list of dictionaries, one per FortNight row:

    output = nested.apply(lambda s: [
        {s.index.name: idx, 'Details': value}
        for idx, value in s.items()
    ]).to_dict('records')
    

    This gives you your final output:

    >>> pprint(output)
    [{'CustomerSales': {'Details': [{'Amount': 339632.0, 'Customer': '10992'},
                                    {'Amount': 27282.0, 'Customer': '10994'},
                                    {'Amount': 26353.0, 'Customer': '10995'},
                                    {'Amount': 24797.0, 'Customer': '11000'},
                                    {'Amount': 21093.0, 'Customer': '10990'}],
                        'FortNight': 'Apr-2FN-2018'}}]
    

    If you need a JSON document, use .to_json(orient='records') rather than .to_dict('records').

    Put together as one expression:

    df.groupby(['Parameter', 'FortNight']).apply(
            lambda r: r[['Customer', 'Amount']].to_dict(orient='records')
        ).unstack('Parameter').apply(lambda s: [
            {s.index.name: idx, 'Details': value}
            for idx, value in s.items()]
        ).to_json(orient='records')
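
    For a runnable end-to-end check, the sketch below feeds a sample df (assumed, rebuilt from the output shown above) through the same expression and parses the result with the standard json module:

```python
import json

import pandas as pd

# Assumed sample data, reconstructed from the output shown in this answer.
df = pd.DataFrame({
    'Parameter': ['CustomerSales'] * 5,
    'FortNight': ['Apr-2FN-2018'] * 5,
    'Customer': ['10992', '10994', '10995', '11000', '10990'],
    'Amount': [339632.0, 27282.0, 26353.0, 24797.0, 21093.0],
})

json_text = (
    df.groupby(['Parameter', 'FortNight'])
    .apply(lambda r: r[['Customer', 'Amount']].to_dict(orient='records'))
    .unstack('Parameter')
    .apply(lambda s: [
        {s.index.name: idx, 'Details': value}
        for idx, value in s.items()])
    .to_json(orient='records')
)

# Parse the JSON text back into Python objects to inspect the structure.
parsed = json.loads(json_text)
entry = parsed[0]['CustomerSales']
```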
    
