Combine multiple dictionaries into one pandas dataframe in long format

Posted by 筅森魡賤 on 2021-02-07 21:48:18

Question


I have several dictionaries set up as follows:

Dict1 = {'Orange': ['1', '2', '3', '4']}
Dict2 = {'Red': ['3', '4', '5']}

And I'd like the output to be one combined dataframe:

| Type | Value |
|------|-------|
|Orange|   1   |
|Orange|   2   |
|Orange|   3   |
|Orange|   4   |
| Red  |   3   |
| Red  |   4   |
| Red  |   5   |

I tried splitting everything out but I only get Dict2 in this dataframe.

mydicts = [Dict1, Dict2]
for x in mydicts:
    for k, v in x.items():
        df = pd.DataFrame(v)
        df['Type'] = k

Answer 1:


One option is using pd.concat:

pd.concat(map(pd.DataFrame, mydicts), axis=1).melt().dropna()

  variable value
0   Orange     1
1   Orange     2
2   Orange     3
3   Orange     4
4      Red     3
5      Red     4
6      Red     5
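For reference, the loop in the question keeps only Dict2 because df is reassigned on every pass, so each new frame replaces the previous one. A minimal sketch of the same loop that collects the pieces and hands them to pd.concat follows; the list name frames is an assumption introduced here for illustration, not part of the original post.

```python
import pandas as pd

Dict1 = {'Orange': ['1', '2', '3', '4']}
Dict2 = {'Red': ['3', '4', '5']}
mydicts = [Dict1, Dict2]

frames = []  # hypothetical name: one small frame per dictionary key
for d in mydicts:
    for k, v in d.items():
        # The scalar k is broadcast down the rows created by the list v.
        frames.append(pd.DataFrame({'Type': k, 'Value': v}))

# Stack the pieces vertically and renumber the rows from 0.
df = pd.concat(frames, ignore_index=True)
print(df)
```

This keeps the structure of the original attempt, which may be easier to follow than the one-liner, at the cost of building one intermediate frame per key.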

If performance matters, you can initialise a single DataFrame using DataFrame.from_dict and melt:

pd.DataFrame.from_dict({**Dict1, **Dict2}, orient='index').T.melt().dropna()
  variable value
0   Orange     1
1   Orange     2
2   Orange     3
3   Orange     4
4      Red     3
5      Red     4
6      Red     5
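To see why dropna is needed here, it may help to split the one-liner into steps. The sketch below is the same approach with the intermediate frame kept in a variable; the names wide and long_df and the var_name/value_name arguments are additions for illustration, chosen to produce the Type and Value headers requested in the question.

```python
import pandas as pd

Dict1 = {'Orange': ['1', '2', '3', '4']}
Dict2 = {'Red': ['3', '4', '5']}

# orient='index' makes one row per key and pads the shorter list
# with a missing value so that all rows have the same width.
wide = pd.DataFrame.from_dict({**Dict1, **Dict2}, orient='index')

long_df = (wide.T                      # transpose: one column per key
               .melt(var_name='Type', value_name='Value')
               .dropna()               # remove the padding row for 'Red'
               .reset_index(drop=True))
print(long_df)
```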

Or, using stack instead of melt (slightly slower, just for completeness):

res = (pd.DataFrame.from_dict({**Dict1, **Dict2}, orient='index').T
         .stack()
         .reset_index(level=1)
         .sort_values('level_1'))
res.columns = ['Type', 'Value']

print(res)
     Type Value
0  Orange     1
1  Orange     2
2  Orange     3
3  Orange     4
0     Red     3
1     Red     4
2     Red     5

The dictionary unpacking syntax {**d1, **d2} requires Python 3.5 or later. On older versions, replace {**Dict1, **Dict2} with {k: v for d in mydicts for k, v in d.items()}.
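A quick check that the two ways of merging agree, together with one caveat worth knowing: if two dictionaries share a key, the later one silently wins. Dict3 below is a hypothetical extra dictionary added only to show that case.

```python
Dict1 = {'Orange': ['1', '2', '3', '4']}
Dict2 = {'Red': ['3', '4', '5']}
mydicts = [Dict1, Dict2]

# Unpacking and the comprehension build the same merged dictionary.
merged_unpack = {**Dict1, **Dict2}
merged_comp = {k: v for d in mydicts for k, v in d.items()}
print(merged_unpack == merged_comp)

# Caveat: a repeated key is overwritten by the later dictionary.
Dict3 = {'Red': ['9']}  # hypothetical, not in the original question
collided = {**Dict2, **Dict3}
print(collided)
```

If the real data can contain repeated keys, merging first would lose rows, so concatenating one frame per dictionary is the safer route in that situation.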




Answer 2:


Use a list comprehension:

pd.DataFrame(
    [(t, v)
     for t, V in {**Dict1, **Dict2}.items()
     for v in V],
    columns=['Type', 'Value']
)

     Type Value
0  Orange     1
1  Orange     2
2  Orange     3
3  Orange     4
4     Red     3
5     Red     4
6     Red     5



Answer 3:


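After stack, it becomes an unnesting problem (unnesting here is a user-defined helper, not a pandas function; its definition is not included in this answer):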

s=pd.DataFrame(mydicts).stack().reset_index(level=1)
unnesting(s,[0])
Out[829]: 
   0 level_1
0  1  Orange
0  2  Orange
0  3  Orange
0  4  Orange
1  3     Red
1  4     Red
1  5     Red
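Since the unnesting helper is not shown, here is a rough substitute rather than the answerer's code: on pandas 0.25 or later, Series.explode does the same kind of unnesting. The sketch builds a Series of lists directly from the merged dictionaries instead of going through stack, and the variable names s and out are assumptions made for this example.

```python
import pandas as pd

Dict1 = {'Orange': ['1', '2', '3', '4']}
Dict2 = {'Red': ['3', '4', '5']}
mydicts = [Dict1, Dict2]

# One entry per key; each value is still a whole list at this point.
s = pd.Series({k: v for d in mydicts for k, v in d.items()})

out = (s.explode()                     # one row per list element
        .rename_axis('Type')           # name the index so it becomes a column
        .reset_index(name='Value'))
print(out)
```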


Source: https://stackoverflow.com/questions/53601657/combine-multiple-dictionaries-into-one-pandas-dataframe-in-long-format
