Question
I have this dataframe of goods-exchange transactions. A client can only switch into goods of the same quality or lower. For example, on 16/08/2019 Client 1 switched 360 of B-grade goods into 180 B and 180 A. Since A is a higher grade than B, this needs to be flagged. I can do this in Excel, but the file is too big and Excel crashes.
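| In/Out | Client | Quality | Date | GoodsAmount |
|--------|--------|---------|------------|-------------|
| In | 1 | A | 16/08/2019 | 180 |
| In | 1 | B | 16/08/2019 | 180 |
| Out | 1 | B | 16/08/2019 | 360 |
| In | 2 | C | 14/08/2019 | 130 |
| Out | 2 | B | 14/08/2019 | 45 |
| Out | 2 | C | 14/08/2019 | 85 |
| In | 1 | B | 18/08/2019 | 80 |
| In | 1 | A | 18/08/2019 | 60 |
| Out | 1 | A | 18/08/2019 | 140 |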
I want to create a new column which will be =GoodsAmount/SUMIFS(GoodsAmount by Client, Quality, In/Out, and Date)
I looked at this question: Pandas: Adding an excel SUMIF column like =A1/SUMIF(B:B,B1,A:A)
And my code is:
df['Percentage'] = df['GoodsAmount'] / df.groupby(['In/Out', 'Client', 'Quality', 'Date'])['GoodsAmount'].transform('sum')
However, there is an error.
In the end, this is the table I have in mind.
| Client | Date | Switch in: A | Switch in: B | Switch in: C | Switch out: A | Switch out: B | Switch out: C | Flagged? |
|--------|-----------|:-:|:-:|:-:|:-:|:-:|:-:|----------|
| 1 | 16-Aug-19 | | | | | | | |
| 1 | 18-Aug-19 | | | | | | | |
| 2 | 14-Aug-19 | | | | | | | |
| 2 | 20-Sep-19 | | | | | | | |
| 2 | 31-Oct-19 | | | | | | | |
| 3 | 11-Mar-19 | | | | | | | |
| 3 | 13-Feb-20 | | | | | | | |
| 3 | 12-Aug-20 | | | | | | | |
Answer 1:
I can't help fix the SUMIFS code, but you don't need SUMIFS for this.
pandas has a function, pivot_table, that may help you.
Starting from the data in your post:
import pandas as pd

# Sample data from the question, keyed by row index
df = pd.DataFrame({'In/Out': {0: 'In', 1: 'In', 2: 'Out', 3: 'In', 4: 'Out', 5: 'Out', 6: 'In', 7: 'In', 8: 'Out'}, 'Client': {0: 1, 1: 1, 2: 1, 3: 2, 4: 2, 5: 2, 6: 1, 7: 1, 8: 1}, 'Quality': {0: 'A', 1: 'B', 2: 'B', 3: 'C', 4: 'B', 5: 'C', 6: 'B', 7: 'A', 8: 'A'}, 'Date': {0: '16/08/2019', 1: '16/08/2019', 2: '16/08/2019', 3: '14/08/2019', 4: '14/08/2019', 5: '14/08/2019', 6: '18/08/2019', 7: '18/08/2019', 8: '18/08/2019'}, 'GoodsAmount': {0: 180, 1: 180, 2: 360, 3: 130, 4: 45, 5: 85, 6: 80, 7: 60, 8: 140}})
You can use:
# Sum GoodsAmount per (Client, Date) row and per (In/Out, Quality) column
pd.pivot_table(df, values='GoodsAmount', index=['Client', 'Date'],
               columns=['In/Out', 'Quality'], aggfunc='sum')
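This returns a table with one row per Client and Date and one column per In/Out and Quality combination, holding the summed GoodsAmount. The arguments are described in the pandas documentation for pivot_table.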
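The pivoted table can also be used for the "Flagged?" column. The sketch below is one possible reading of the rule, not something stated in the question: it assumes the grades rank A > B > C, and it flags a row when the cumulative amount switched in at a given grade or better exceeds the cumulative amount switched out at that grade or better. The names `grades`, `switch_in`, `switch_out`, `flagged` and `result` are made up for illustration.

```python
import pandas as pd

# Sample data from the question
df = pd.DataFrame({
    "In/Out": ["In", "In", "Out", "In", "Out", "Out", "In", "In", "Out"],
    "Client": [1, 1, 1, 2, 2, 2, 1, 1, 1],
    "Quality": ["A", "B", "B", "C", "B", "C", "B", "A", "A"],
    "Date": ["16/08/2019"] * 3 + ["14/08/2019"] * 3 + ["18/08/2019"] * 3,
    "GoodsAmount": [180, 180, 360, 130, 45, 85, 80, 60, 140],
})

# Assumption: grades ordered from best to worst
grades = ["A", "B", "C"]

# One row per (Client, Date), one column per (In/Out, Quality); missing cells become 0
wide = pd.pivot_table(df, values="GoodsAmount", index=["Client", "Date"],
                      columns=["In/Out", "Quality"], aggfunc="sum", fill_value=0)

# Split into the two halves and force the assumed grade order on the columns
switch_in = wide["In"].reindex(columns=grades, fill_value=0)
switch_out = wide["Out"].reindex(columns=grades, fill_value=0)

# Running totals from the best grade downwards. If more was switched in at
# "grade X or better" than was switched out at "grade X or better", some
# goods must have moved up in quality, so the row is flagged.
flagged = (switch_in.cumsum(axis=1) > switch_out.cumsum(axis=1)).any(axis=1)

# Assemble a flat table similar to the one sketched in the question
result = pd.concat([switch_in.add_prefix("In "), switch_out.add_prefix("Out ")], axis=1)
result["Flagged?"] = flagged
print(result)
```

On the sample data this flags only Client 1 on 16/08/2019, where 180 of grade A comes in although no grade A went out. If the real rule is simpler or stricter, only the line that computes `flagged` needs to change.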
Source: https://stackoverflow.com/questions/64060574/sumifs-in-python-jupyter