Question
I have a defaultdict(list) object with this structure:
{id: [list[list]]}
For example:
'a1': [[0.01, 'cat']],
'a2': [[0.09, 'cat']],
'a3': [[0.5, 'dog']],
...
I'd like to convert this defaultdict(list) into a Pandas DataFrame object.
I tried with the following:
df = pd.DataFrame(list(my_dict.items()), columns=['id', 'category'])
However, I ran into a problem with the 'category' column: each entry is a list of lists. I'm trying to split the two values in 'category' into two separate columns, so that the final DataFrame columns are ['id', 'score', 'category'].
When I tried the following apply call:
db['category'].apply(lambda x: x[0][0])
I got a 'list index out of range' error.
What could be wrong with my code? How should I create the two new columns from a list of lists?
Thank you.
Answer 1:
I believe you need:
df = pd.DataFrame([[k] + v[0] for k, v in my_dict.items()],
                  columns=['id', 'score', 'category'])
Or:
df = pd.DataFrame([(k, v[0][0], v[0][1]) for k, v in my_dict.items()],
                  columns=['id', 'score', 'category'])
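One possible cause of the 'list index out of range' error is a key whose list is empty. This is an assumption, since the question does not show the full data, but a defaultdict(list) stores an empty list as soon as a missing key is read, and v[0] then fails on it. A minimal sketch of the first variant above with a guard for that case follows; the sample data and the extra key 'a4' are hypothetical.

```python
from collections import defaultdict

import pandas as pd

# Hypothetical sample data mirroring the structure in the question.
my_dict = defaultdict(list)
my_dict['a1'].append([0.01, 'cat'])
my_dict['a2'].append([0.09, 'cat'])
my_dict['a3'].append([0.5, 'dog'])
my_dict['a4']  # merely reading a missing key stores an empty list

# Skip keys whose list is empty, so v[0] cannot raise IndexError.
rows = [[k] + v[0] for k, v in my_dict.items() if v]
df = pd.DataFrame(rows, columns=['id', 'score', 'category'])
print(df)
```

Whether dropping such keys is the right choice depends on the data; keeping them with missing values would need a different row-building step.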
Answer 2:
Use a list comprehension, for example:
import pandas as pd
d = {'a1': [[0.01, 'cat']], 'a2': [[0.09, 'cat']], 'a3': [[0.5, 'dog']]}
df = pd.DataFrame([[k] + j for k, v in d.items() for j in v], columns=['id', 'score', 'category'])
print(df)
Output:
   id  score category
0  a1   0.01      cat
1  a3   0.50      dog
2  a2   0.09      cat
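Because this comprehension iterates over every inner list, it behaves differently from indexing v[0] when a key holds more than one pair or none at all. The sketch below illustrates this with hypothetical data that is not from the question: 'a1' has two pairs and 'a4' has an empty list.

```python
import pandas as pd

# Hypothetical data: 'a1' has two [score, category] pairs, 'a4' has none.
d = {'a1': [[0.01, 'cat'], [0.7, 'dog']], 'a2': [[0.09, 'cat']], 'a4': []}

# One row per inner pair; a key with an empty list contributes no rows.
rows = [[k] + pair for k, pairs in d.items() for pair in pairs]
df = pd.DataFrame(rows, columns=['id', 'score', 'category'])
print(df)
```

Here 'a1' yields two rows and 'a4' yields none, so no index error is raised for empty lists.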
Source: https://stackoverflow.com/questions/54122942/how-to-convert-a-defaultdictlist-to-pandas-dataframe