Question
I'm trying to grab a 30-day window looking back from every date in a DataFrame, and also to look at the same 30-day window in every other year of the dataset. The dates run from 2000 to 2019. For example, starting on 1 February 2000, I would like to grab the previous 30 days, plus the 30 days before 1 February in all of the other years.
I can get a rolling window to work over n days for a z-score:
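import pandas as pd
from random import randint

# Daily index with a random integer (0-100 inclusive) for each day
dt = pd.date_range(start='2000-01-01', end='2019-03-01')
x = [randint(0, 100) for _ in range(len(dt))]
DTX = pd.DataFrame({'X': x}, index=dt)

def zscore(x, window):
    """ calculate z-score across a window (assumes normal distribution) """
    r = x.rolling(window=window)
    # shift(1) so each value is compared with the window that ends the day before
    m = r.mean().shift(1)
    s = r.std(ddof=0).shift(1)
    z = (x - m) / s
    return z

DTX['Z'] = zscore(DTX['X'], 30)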
Or a rank for the window:
def ranked_percent(col, window):
    """ rank values in a window as a decimal (highest=1)"""
    pctrank = lambda x: pd.Series(x).rank(pct=True).iloc[-1]
    rollingrank = col.rolling(window=window, raw=False).apply(pctrank)
    return rollingrank

DTX['Rank'] = ranked_percent(DTX['X'], 30)
I was wondering about using groupby with pd.Grouper, but I have no idea how to implement it. I'm not wedded to that, though: any fairly vectorized, fast Python solution would help. What I really need is to extend these calculations over all of the years in the dataset. I would appreciate any help. Many thanks.
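To make the target window concrete, here is a minimal sketch of what "the 30 days before the same calendar date in every year" could look like for a single anchor date. The helper names (`anchor_in_year`, `multi_year_window`, `multi_year_zscore`) are hypothetical, and several choices are assumptions rather than requirements from the question: the anchor day itself is excluded, 29 February is mapped to 28 February in non-leap years, and later years are pooled along with earlier ones.

```python
import calendar

import numpy as np
import pandas as pd


def anchor_in_year(anchor, year):
    """Move `anchor` to `year`; 29 Feb becomes 28 Feb in non-leap years (assumption)."""
    if anchor.month == 2 and anchor.day == 29 and not calendar.isleap(year):
        return anchor.replace(year=year, day=28)
    return anchor.replace(year=year)


def multi_year_window(series, anchor, days=30):
    """Pool the `days` days before `anchor`'s calendar date in every year of `series`."""
    anchor = pd.Timestamp(anchor)
    pieces = []
    for year in series.index.year.unique():
        end = anchor_in_year(anchor, int(year))
        start = end - pd.Timedelta(days=days)
        # Half-open window [start, end): the anchor day itself is excluded.
        in_window = (series.index >= start) & (series.index < end)
        pieces.append(series[in_window])
    return pd.concat(pieces)


def multi_year_zscore(series, anchor, days=30):
    """z-score of the value at `anchor` against the pooled multi-year window."""
    pooled = multi_year_window(series, anchor, days)
    value = series.loc[pd.Timestamp(anchor)]
    return (value - pooled.mean()) / pooled.std(ddof=0)


# Sample data shaped like the DataFrame in the question (seeded for repeatability).
rng = np.random.default_rng(0)
dt = pd.date_range(start="2000-01-01", end="2019-03-01")
s = pd.Series(rng.integers(0, 101, size=len(dt)), index=dt, name="X")

window = multi_year_window(s, "2010-02-01")
z = multi_year_zscore(s, "2010-02-01")
```

This only pins down the window for one date at a time; looping it over every row would be slow, so it is a reference for checking a faster, vectorized solution rather than that solution itself. Windows that run past the start of the data (for example, early January 2000) simply contain fewer observations.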
Source: https://stackoverflow.com/questions/55104775/pandas-get-30-day-rolling-window-over-n-years