Question
I have several dictionaries set up as follows:
Dict1 = {'Orange': ['1', '2', '3', '4']}
Dict2 = {'Red': ['3', '4', '5']}
And I'd like the output to be one combined dataframe:
| Type   | Value |
|--------|-------|
| Orange | 1     |
| Orange | 2     |
| Orange | 3     |
| Orange | 4     |
| Red    | 3     |
| Red    | 4     |
| Red    | 5     |
I tried splitting everything out but I only get Dict2 in this dataframe.
mydicts = [Dict1, Dict2]
for x in mydicts:
    for k, v in x.items():
        df = pd.DataFrame(v)
        df['Type'] = k
Answer 1:
One option is using pd.concat:
pd.concat(map(pd.DataFrame, mydicts), axis=1).melt().dropna()
variable value
0 Orange 1
1 Orange 2
2 Orange 3
3 Orange 4
4 Red 3
5 Red 4
6 Red 5
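For comparison, the loop from the question can also be repaired directly. It reassigns df on every pass, so only the last dictionary survives. A minimal sketch, assuming the same Dict1 and Dict2 as in the question (the list name frames is illustrative, not from the original post), collects one piece per key and concatenates once at the end:

```python
import pandas as pd

Dict1 = {'Orange': ['1', '2', '3', '4']}
Dict2 = {'Red': ['3', '4', '5']}
mydicts = [Dict1, Dict2]

frames = []  # illustrative name: one small DataFrame per dictionary key
for d in mydicts:
    for k, v in d.items():
        # Keep every piece instead of overwriting a single df;
        # the scalar key k is broadcast down the length of the list v
        frames.append(pd.DataFrame({'Type': k, 'Value': v}))

# Stack the pieces vertically and renumber the rows from 0
df = pd.concat(frames, ignore_index=True)
print(df)
```

This stays close to the original attempt and keeps the Type and Value column names from the desired output, at the cost of building one intermediate DataFrame per key.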
If performance matters, you can initialise a single DataFrame using DataFrame.from_dict and melt:
pd.DataFrame.from_dict({**Dict1, **Dict2}, orient='index').T.melt().dropna()
variable value
0 Orange 1
1 Orange 2
2 Orange 3
3 Orange 4
4 Red 3
5 Red 4
6 Red 5
Or, using stack instead of melt (slightly slower, just for completeness):
res = (pd.DataFrame.from_dict({**Dict1, **Dict2}, orient='index').T
         .stack()
         .reset_index(level=1)
         .sort_values('level_1'))
res.columns = ['Type', 'Value']
print(res)
Type Value
0 Orange 1
1 Orange 2
2 Orange 3
3 Orange 4
0 Red 3
1 Red 4
2 Red 5
The dictionary unpacking syntax works with Python 3.5 and later. On older versions, replace {**d1, **d2} with {k: v for d in mydicts for k, v in d.items()}.
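As a quick check of that claim, the sketch below builds the merged dictionary both ways and compares them. The variable names merged_unpack, merged_comp and overwritten are illustrative only. It also shows a caveat that applies to both forms: if two dictionaries share a key, the later one silently replaces the earlier one.

```python
Dict1 = {'Orange': ['1', '2', '3', '4']}
Dict2 = {'Red': ['3', '4', '5']}
mydicts = [Dict1, Dict2]

# Unpacking form (Python 3.5+)
merged_unpack = {**Dict1, **Dict2}

# Comprehension form, which also scales to any number of dictionaries
merged_comp = {k: v for d in mydicts for k, v in d.items()}

print(merged_unpack == merged_comp)

# Caveat: on a duplicate key, the value from the later dictionary wins
overwritten = {**{'Red': ['1']}, **{'Red': ['2']}}
print(overwritten)
```

If the input dictionaries can repeat a key, an approach that iterates over mydicts directly rather than merging first avoids losing rows.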
Answer 2:
Comprehension
pd.DataFrame(
    [(t, v)
     for t, V in {**Dict1, **Dict2}.items()
     for v in V],
    columns=['Type', 'Value']
)
Type Value
0 Orange 1
1 Orange 2
2 Orange 3
3 Orange 4
4 Red 3
5 Red 4
6 Red 5
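Answer 3:
After stack, it becomes an unnesting problem (unnesting is a user-defined helper, not a pandas function; its definition is not shown here):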
s=pd.DataFrame(mydicts).stack().reset_index(level=1)
unnesting(s,[0])
Out[829]:
0 level_1
0 1 Orange
0 2 Orange
0 3 Orange
0 4 Orange
1 3 Red
1 4 Red
1 5 Red
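The unnesting helper is not defined in this answer, so the snippet does not run as posted. A minimal sketch of the same idea, assuming pandas 0.25 or later, swaps in the built-in DataFrame.explode instead; this is a substitute technique, not the answerer's helper. The intermediate name nested is illustrative.

```python
import pandas as pd

Dict1 = {'Orange': ['1', '2', '3', '4']}
Dict2 = {'Red': ['3', '4', '5']}
mydicts = [Dict1, Dict2]

# One row per key, with the whole list held in a single cell
nested = pd.DataFrame(
    [(k, v) for d in mydicts for k, v in d.items()],
    columns=['Type', 'Value'],
)

# explode() emits one row per list element and repeats 'Type' alongside it;
# reset_index(drop=True) replaces the repeated index labels with 0..n-1
res = nested.explode('Value').reset_index(drop=True)
print(res)
```

Because this iterates over mydicts rather than merging the dictionaries first, a key that appears in more than one dictionary keeps all of its rows.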
Source: https://stackoverflow.com/questions/53601657/combine-multiple-dictionaries-into-one-pandas-dataframe-in-long-format