Question
I have a MultiIndex DataFrame. The index levels are Date and Symbol. I want to reset the row where the DataFrame starts to evaluate rolling_max of number for each Symbol. I want to do this based on a column containing True or False. If condition is True on a Date, then rolling_max should be reset and the max calculated from this Date. If condition is False, then rolling_max should work 'normally', taking the max of today's and yesterday's number for the given Symbol. The condition column has nothing to do with the number column (they do not depend on each other). This is the expected output:
                   number  condition  rolling_max
Date       Symbol
1990-01-01 A           29      False           29
1990-01-01 B            7      False            7
1990-01-02 A           13       True           13  # Reset rolling max for 'A'
1990-01-02 B            2      False            7
1990-01-03 A           11      False           13
1990-01-03 B           52       True           52  # Reset rolling max for 'B'
1990-01-04 A           30      False           30
1990-01-04 B            1      False           52
1990-01-05 A           19       True           19  # Reset rolling max for 'A'
1990-01-05 B           65      False           65
1990-01-06 A           17      False           19
1990-01-06 B           20       True           20  # Reset rolling max for 'B'
How can I do this?
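Answer 1:
I was able to solve this. A running sum of condition within each Symbol increases by one at every True, so it labels the stretch of rows since the last reset. Grouping by Symbol and that label, then taking the cumulative max of number, gives the result:
df['rolling_max'] = df.groupby(['Symbol', df.groupby('Symbol')['condition'].cumsum()])['number'].cummax()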
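To check the one-liner against the expected output, here is a minimal sketch that rebuilds the sample frame from the question and applies the same grouping in two steps. The intermediate name reset_id is introduced here for illustration only and is not part of the original answer; the data is copied from the table in the question.

```python
import pandas as pd

# Rebuild the sample frame from the question: six dates, two symbols per date.
dates = pd.to_datetime(
    ["1990-01-01", "1990-01-02", "1990-01-03",
     "1990-01-04", "1990-01-05", "1990-01-06"]
)
index = pd.MultiIndex.from_product([dates, ["A", "B"]], names=["Date", "Symbol"])
df = pd.DataFrame(
    {
        "number": [29, 7, 13, 2, 11, 52, 30, 1, 19, 65, 17, 20],
        "condition": [False, False, True, False, False, True,
                      False, False, True, False, False, True],
    },
    index=index,
)

# Running count of True values per Symbol. It increases by one on every
# reset row, so rows sharing a value belong to the same "since last reset" run.
# (reset_id is a hypothetical helper name, not from the original answer.)
reset_id = df.groupby("Symbol")["condition"].cumsum().rename("reset_id")

# Cumulative max of number within each (Symbol, reset_id) run.
df["rolling_max"] = df.groupby(["Symbol", reset_id])["number"].cummax()

print(df)
```

Splitting out reset_id makes it easy to inspect where each run starts. Note that this computes a cumulative max since the last reset rather than a fixed two-row window; for the sample data both readings give the same values.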
Source: https://stackoverflow.com/questions/52651800/how-to-conditionally-reset-a-rolling-maxs-initial-value-row-in-pandas-multiinde