How to convert a defaultdict(list) to Pandas DataFrame

Submitted by ぐ巨炮叔叔 on 2021-01-27 12:26:58

Question


I have a defaultdict(list) object that is of this structure:

{id: [list[list]]}

for example,

'a1': [[0.01, 'cat']],

'a2': [[0.09, 'cat']],

'a3': [[0.5, 'dog']],

...

I'd like to convert this defaultdict(list) into a Pandas DataFrame object.

I tried with the following:

df = pd.DataFrame(list(my_dict.items()), columns=['id', 'category'])

However, I ran into a problem with the 'category' column: each cell holds a list of lists. I want to split the two values in 'category' into two separate columns, so the final DataFrame columns would be ['id', 'score', 'category'].

When I tried the following apply call:

df['category'].apply(lambda x: x[0][0])

I got the error 'list index out of range'.

What could be wrong with my code? How shall I create the 2 new columns from a list of lists?

Thank you.


Answer 1:


I believe you need:

df = pd.DataFrame([[k] + v[0] for k, v in my_dict.items()], 
                   columns=['id', 'score', 'category'])

Or:

df = pd.DataFrame([(k, v[0][0], v[0][1]) for k, v in my_dict.items()], 
                   columns=['id', 'score', 'category'])
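A note on the 'list index out of range' error from the question: both versions above index v[0], so they raise the same error if any key maps to an empty list. With a defaultdict(list) that can happen silently, because merely reading a missing key creates an empty list for it. Whether that is what happened in the asker's data is an assumption; the sketch below uses made-up sample data (the key 'a4' is hypothetical) and simply skips empty entries:

```python
from collections import defaultdict

import pandas as pd

# Hypothetical sample data shaped like the question's dictionary.
my_dict = defaultdict(list)
my_dict['a1'].append([0.01, 'cat'])
my_dict['a2'].append([0.09, 'cat'])
my_dict['a3'].append([0.5, 'dog'])
my_dict['a4']  # reading a missing key creates an empty list as a side effect

# Keep only keys that hold at least one [score, category] pair,
# so that v[0] can never raise IndexError.
rows = [[k] + v[0] for k, v in my_dict.items() if v]
df = pd.DataFrame(rows, columns=['id', 'score', 'category'])
print(df)
```

If the empty keys should appear in the result with missing values instead of being dropped, the filter would need to be replaced by a fallback row; that choice depends on the data.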



Answer 2:


Use a nested list comprehension, which produces one row per inner list.

Example:

import pandas as pd
d = {'a1': [[0.01, 'cat']], 'a2': [[0.09, 'cat']], 'a3': [[0.5, 'dog']]}

df = pd.DataFrame([[k] + j for k, v in d.items() for j in v], columns=['id', 'score', 'category'])
print(df)

Output:

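   id  score category
0  a1   0.01      cat
1  a2   0.09      cat
2  a3   0.50      dog

(Row order follows the dictionary's insertion order on Python 3.7+; on older Python versions it may differ.)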
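Because the comprehension iterates over every inner list, it should also cope with keys that hold several [score, category] pairs or none at all, without indexing v[0]. A minimal sketch with made-up data (the extra pair under 'a1' and the empty 'a2' are assumptions for illustration):

```python
import pandas as pd

# Hypothetical data: 'a1' has two pairs, 'a2' has none, 'a3' has one.
d = {'a1': [[0.01, 'cat'], [0.7, 'dog']], 'a2': [], 'a3': [[0.5, 'dog']]}

# One row per inner [score, category] pair; empty lists contribute no rows.
rows = [[k] + pair for k, v in d.items() for pair in v]
df = pd.DataFrame(rows, columns=['id', 'score', 'category'])
print(df)
```

Here 'a1' yields two rows and 'a2' yields none, so the id column is no longer unique per row.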


Source: https://stackoverflow.com/questions/54122942/how-to-convert-a-defaultdictlist-to-pandas-dataframe
