Question
I'm trying to learn how pandas works, but I assume I'm missing something obvious.
I have a dictionary that looks like this:
dict_spl ={'doc1':[[('word11',1,1),('word12',1,2)]], 'doc2':[[('word21',2,1),('word22',2,2)]]}
And I'm trying to obtain a pandas DataFrame that looks like this:
# doc1 word11 1 1
# doc1 word12 1 2
# doc2 word21 2 1
# doc2 word22 2 2
I haven't found a way to create both new columns and new rows while duplicating the common values.
Answer 1:
You can use:
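import pandas as pd
# prepend each key to its tuples, giving one list of rows per document
a = [[(k, *y) for y in v[0]] for k,v in dict_spl.items()]
# flatten the per-document lists into a single list of rows
a = [item for sublist in a for item in sublist]
df = pd.DataFrame(a, columns=list('abcd'))
print(df)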
      a       b  c  d
0  doc1  word11  1  1
1  doc1  word12  1  2
2  doc2  word21  2  1
3  doc2  word22  2  2
I felt there should be a better solution, so I asked about it separately:
# Martijn Pieters's solution
a = [(k, *t) for k, v in dict_spl.items() for t in v[0]]
df = pd.DataFrame(a, columns=list('abcd'))
print (df)
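      a       b  c  d
0  doc2  word21  2  1
1  doc2  word22  2  2
2  doc1  word11  1  1
3  doc1  word12  1  2
The rows appear in the dictionary's iteration order. Before Python 3.7 that order was not guaranteed to match insertion order, which likely explains why doc2 comes first in this output.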
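Both solutions read only `v[0]`, which is enough for the data in the question because each value holds exactly one inner list. If a value could hold several inner lists, a sketch along the following lines would cover them all. The input `dict_multi` is a made-up example and is not from the question.

```python
from itertools import chain

# Hypothetical input: 'doc1' has two inner lists instead of one.
dict_multi = {
    'doc1': [[('word11', 1, 1)], [('word12', 1, 2)]],
    'doc2': [[('word21', 2, 1), ('word22', 2, 2)]],
}

rows = [
    (k, *t)
    for k, v in dict_multi.items()
    for t in chain.from_iterable(v)  # walk every inner list, not only v[0]
]
print(rows)
```

The resulting `rows` list can be passed to `pd.DataFrame(rows, columns=list('abcd'))` in the same way as in the answer.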
Source: https://stackoverflow.com/questions/45215820/dictionary-of-nested-lists-to-pandas-dataframe