Pandas to D3. Serializing dataframes to JSON

Submitted by 左心房为你撑大大i on 2021-02-06 04:27:24

Question


I have a DataFrame with the following columns and no duplicates:

['region', 'type', 'name', 'value']

that can be seen as a hierarchy as follows

grouped = df.groupby(['region','type', 'name'])

I would like to serialize this hierarchy as a JSON object.

If anyone is interested, the motivation behind this is to eventually put together a visualization like this one which requires a JSON file.

To do so, I need to convert grouped into the following:

new_data['children'][i]['name'] = region
new_data['children'][i]['children'][j]['name'] = type
new_data['children'][i]['children'][j]['children'][k]['name'] = name
new_data['children'][i]['children'][j]['children'][k]['size'] = value
...

where region, type, name correspond to different levels of the hierarchy (indexed by i, j and k)

Is there an easy way in Pandas/Python to do this?


Answer 1:


Something along these lines might get you there.

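import json
from collections import defaultdict

tree = lambda: defaultdict(tree)  # a recursive defaultdict
d = tree()
# Each row is a Series; unpacking it assumes exactly these four columns, in this order.
# type_ is used instead of type to avoid shadowing the builtin.
for _, (region, type_, name, value) in df.iterrows():
    d['children'][region]['name'] = region
    ...

json.dumps(d)  # works because defaultdict is a dict subclass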
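
One way the elided loop body might be completed is sketched below. It is an assumption rather than the answerer's actual code: the tree is first keyed by region, type and name, and a hypothetical helper `to_children` then converts it into the nested `children` lists the question asks for. The `rows` list is made-up sample data standing in for `df.itertuples(index=False)`, and the root name `"root"` is arbitrary.

```python
import json
from collections import defaultdict


def tree():
    """A recursive defaultdict: missing keys create nested trees."""
    return defaultdict(tree)


# Hypothetical sample rows: (region, type, name, value).
# With a real DataFrame, df.itertuples(index=False) would yield the same shape.
rows = [
    ("north", "a", "x", 1),
    ("north", "a", "y", 2),
    ("north", "b", "z", 3),
    ("south", "a", "w", 4),
]

d = tree()
for region, type_, name, value in rows:
    d[region][type_][name] = value  # leaves hold the value


def to_children(node):
    """Turn a {key: subtree-or-value} mapping into a list of D3-style nodes."""
    children = []
    for key, sub in node.items():
        if isinstance(sub, dict):
            children.append({"name": key, "children": to_children(sub)})
        else:
            children.append({"name": key, "size": sub})
    return children


new_data = {"name": "root", "children": to_children(d)}
flare_json = json.dumps(new_data, indent=2)
```

Keying by name first and converting afterwards avoids having to search a list for an existing child on every row.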

A vectorized solution would be better, and maybe something that takes advantage of the speed of groupby, but I can't think of such a solution.
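
A groupby-based variant could look roughly like the sketch below. It is not vectorized: it recurses one hierarchy level at a time and lets `groupby` do the splitting. The function name `to_flare`, the sample data and the root name `"flare"` are all assumptions made for illustration.

```python
import json

import pandas as pd

# Hypothetical sample data; the column names follow the question.
df = pd.DataFrame(
    {
        "region": ["north", "north", "north", "south"],
        "type": ["a", "a", "b", "a"],
        "name": ["x", "y", "z", "w"],
        "value": [1, 2, 3, 4],
    }
)


def to_flare(frame, levels, size_col, root_name="flare"):
    """Nest `frame` by the columns in `levels`; leaves carry `size_col` as 'size'."""

    def build(sub, remaining):
        col = remaining[0]
        if len(remaining) == 1:
            # Leaf level: one node per row. tolist() yields plain Python
            # scalars, which json.dumps can serialize.
            return [
                {"name": n, "size": v}
                for n, v in zip(sub[col].tolist(), sub[size_col].tolist())
            ]
        # Inner level: one node per distinct value, in order of first appearance.
        return [
            {"name": key, "children": build(group, remaining[1:])}
            for key, group in sub.groupby(col, sort=False)
        ]

    return {"name": root_name, "children": build(frame, levels)}


new_data = to_flare(df, ["region", "type", "name"], "value")
flare_json = json.dumps(new_data, indent=2)
```

Writing `flare_json` to a file should give something in the shape the D3 hierarchy examples expect, assuming they read `name`, `children` and `size` keys as described in the question.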

Also take a look at df.groupby(...).groups, which returns a dict mapping each group key to the index labels of its rows.
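
As a small illustration of what `.groups` holds, here is a sketch on made-up sample data (the DataFrame below is hypothetical). With three grouping columns, each key is a `(region, type, name)` tuple.

```python
import pandas as pd

# Hypothetical sample data; the column names follow the question.
df = pd.DataFrame(
    {
        "region": ["north", "north", "south"],
        "type": ["a", "b", "a"],
        "name": ["x", "z", "w"],
        "value": [1, 3, 4],
    }
)

groups = df.groupby(["region", "type", "name"]).groups

# Each key is a (region, type, name) tuple; each value is an Index of row labels.
for (region, type_, name), labels in groups.items():
    print(region, type_, name, list(labels))
```

Since the question says there are no duplicates, each key here maps to a single row label, which can be passed to `df.loc` to look up the value.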

See also this answer.




Answer 2:


Here's another script to take a pandas df and output a flare.json file: https://github.com/andrewheekin/csv2flare.json



Source: https://stackoverflow.com/questions/23531145/pandas-to-d3-serializing-dataframes-to-json
