问题
My question is very similar to Cumsum within group and reset on condition in pandas and Pandas: cumsum per category based on additional condition but they don't quite get me there due to my conditional requirements. I have a data frame that looks like this:
TransactionId Delta
14 2
14 3
14 1
14 2
15 4
15 2
15 3
I want to create another column "Cumulative" that calculates the cumsum of Delta for each TransactionId. So the result would look like this:
TransactionId Delta Cumulative
14 2 2
14 3 5
14 1 6
14 2 8
15 4 4
15 2 6
15 3 9
I have the condition for checking TransactionId equality setup:
c1 = df.TransactionId.eq(df.TransactionId.shift())
But I can't figure out how to add the Delta value to the previous Cumulative row.
回答1:
Use groupby.cumsum:
df['Cumulative'] = df.groupby('TransactionId')['Delta'].cumsum()
print (df)
TransactionId Delta Cumulative
0 14 2 2
1 14 3 5
2 14 1 6
3 14 2 8
4 15 4 4
5 15 2 6
6 15 3 9
来源:https://stackoverflow.com/questions/53079313/cumsum-reset-based-on-a-condition-in-pandas