Question
I want to resample a pandas time series counting backwards. For example, let's set up a simple time series of 11 days:
>>> index = pd.date_range('01-01-2018', '01-11-2018', freq='D')
>>> randint = np.random.randint(low=0, high=9, size=(len(index), 1))
>>> df = pd.DataFrame(randint, index=index, columns=['random'])
>>> print(df)
            random
2018-01-01       8
2018-01-02       8
2018-01-03       1
2018-01-04       4
2018-01-05       3
2018-01-06       5
2018-01-07       2
2018-01-08       6
2018-01-09       5
2018-01-10       1
2018-01-11       3
Default pandas behavior
If I resample it every 5 days, I'd get:
>>> df_5d = df.resample('5D').sum()
>>> print(df_5d)
            random
2018-01-01      24
2018-01-06      19
2018-01-11       3
Basically you have 3 groupings: the first two groups have 5 members and the last group has 1, for a total of 11 members overall:
Start       End
2018-01-01  2018-01-05
2018-01-06  2018-01-10
2018-01-11  2018-01-11
What I want is this (the output shown is the desired result, not what resample('5D') actually returns)
>>> df_5d = df.resample('5D').sum()
>>> print(df_5d)
            random
2018-01-01       8
2018-01-02      21
2018-01-07      17
And the groupings are shown below. See how I counted '5D' backwards starting from the latest date:
Start       End
2018-01-01  2018-01-01
2018-01-02  2018-01-06
2018-01-07  2018-01-11
How do I resample a pandas time series counting backwards?
Answer 1:
A workaround is to split your original df in two so that standard resampling works on each part, then pd.concat the two resampled dataframes:
res_interval = 5
df_res = pd.concat([df[:len(df) % res_interval].resample('{}D'.format(res_interval)).sum(),
                    df[len(df) % res_interval:].resample('{}D'.format(res_interval)).sum()])
With my random numbers, I get:
            random
2018-01-01       1
2018-01-02      13
2018-01-07      26
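The same split-and-concat idea can be wrapped in a small helper. This is only a sketch: resample_backwards is a hypothetical name, the data are the values printed in the question rather than fresh random numbers, and, like the snippet above, it slices by row count, so it assumes one row per day with no gaps. It also drops an empty leading piece when the length divides evenly.

```python
import pandas as pd


def resample_backwards(df, days):
    """Sum df in blocks of `days` rows, aligned to the last row (hypothetical helper)."""
    if len(df) == 0:
        return df.copy()
    remainder = len(df) % days
    rule = '{}D'.format(days)
    # Leading partial block (possibly empty) and the evenly divisible rest.
    pieces = [df.iloc[:remainder], df.iloc[remainder:]]
    # Resample each non-empty piece on its own so its bins start at its first date.
    return pd.concat([p.resample(rule).sum() for p in pieces if len(p)])


# Values copied from the question's printed frame (assumed for reproducibility).
index = pd.date_range('2018-01-01', '2018-01-11', freq='D')
df = pd.DataFrame({'random': [8, 8, 1, 4, 3, 5, 2, 6, 5, 1, 3]}, index=index)

result = resample_backwards(df, 5)
print(result)
```

With the question's values this gives 8, 21 and 17, labelled 2018-01-01, 2018-01-02 and 2018-01-07, which matches the desired output.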
Answer 2:
You could number the rows from the end, integer-divide by 5, and group on that key:
In [452]: t = np.arange(len(df.index)-1, -1, -1) // 5
In [453]: df.reset_index().groupby(t, sort=False)['index'].agg(['min', 'max'])
Out[453]:
         min        max
2 2018-01-01 2018-01-01
1 2018-01-02 2018-01-06
0 2018-01-07 2018-01-11
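The output above shows only the group boundaries. A minimal sketch of getting the sums from the same key follows, assuming the values printed in the question and one row per day. The names window, out, start, end and total are illustrative, and the named-aggregation syntax needs pandas 0.25 or later.

```python
import numpy as np
import pandas as pd

# Values copied from the question's printed frame (assumed for reproducibility).
index = pd.date_range('2018-01-01', '2018-01-11', freq='D')
df = pd.DataFrame({'random': [8, 8, 1, 4, 3, 5, 2, 6, 5, 1, 3]}, index=index)

window = 5
# Row position counted from the end (0 = last row), integer-divided into blocks.
t = np.arange(len(df) - 1, -1, -1) // window

out = (
    df.reset_index()            # the unnamed index becomes a column called 'index'
      .groupby(t, sort=False)   # keep groups in order of appearance (oldest first)
      .agg(start=('index', 'min'), end=('index', 'max'), total=('random', 'sum'))
      .set_index('start')       # label each block by its first date
)
print(out)
```

Because the key counts rows rather than calendar days, a series with missing days would need a key built from the dates instead.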
Source: https://stackoverflow.com/questions/51787945/pandas-resample-time-series-counting-backwards-or-reverse-resample