Pandas resample time series counting backwards (or reverse resample)

Submitted by *爱你&永不变心* on 2019-12-24 00:22:11

Question


I want to resample a pandas time series counting backwards. For example, let's set up a simple time series of 11 days:

>>> import numpy as np
>>> import pandas as pd

>>> index = pd.date_range('01-01-2018', '01-11-2018', freq='D')
>>> randint = np.random.randint(low=0, high=9, size=(len(index), 1))

>>> df = pd.DataFrame(randint, index=index, columns=['random'])
>>> print(df)

            random
2018-01-01       8
2018-01-02       8
2018-01-03       1
2018-01-04       4
2018-01-05       3
2018-01-06       5
2018-01-07       2
2018-01-08       6
2018-01-09       5
2018-01-10       1
2018-01-11       3

Default pandas behavior

If I resample it every 5 days, I'd get:

>>> df_5d = df.resample('5D').sum()
>>> print(df_5d)

            random
2018-01-01      24
2018-01-06      19
2018-01-11       3

This gives 3 groups: the first two have 5 members each and the last has 1, for a total of 11 members:

Start        End
2018-01-01   2018-01-05
2018-01-06   2018-01-10
2018-01-11   2018-01-11
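
The Start/End table above can be reproduced by resampling the index itself. This is a minimal sketch, assuming the same 11-day index as in the question; the name `bounds` is only illustrative:

```python
import pandas as pd

index = pd.date_range('2018-01-01', '2018-01-11', freq='D')

# Resample the timestamps themselves and take the first and last
# timestamp that falls into each 5-day bin.
bounds = index.to_series().resample('5D').agg(['min', 'max'])
print(bounds)
```

The `min` column corresponds to Start and the `max` column to End, which shows that the default bins are anchored at the first date.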

What I want is this instead (the call shown is only a placeholder; plain resample('5D') does not produce this result):

>>> df_5d = df.resample('5D').sum()
>>> print(df_5d)

            random
2018-01-01       8
2018-01-02      21
2018-01-07      17

And the groupings are shown below. See how I counted '5D' backwards starting from the latest date:

Start        End
2018-01-01   2018-01-01
2018-01-02   2018-01-06
2018-01-07   2018-01-11

How do I resample a pandas time series counting backwards?


Answer 1:


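A workaround is to split your original df in two so that standard resampling can be applied to each part, then pd.concat the two resampled DataFrames. The split is positional (the first len(df) % res_interval rows go into the short leading group), so it assumes one row per day with no gaps: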

res_interval = 5
df_res = pd.concat([df[:len(df)%res_interval].resample('{}D'.format(res_interval)).sum(),
                    df[len(df)%res_interval:].resample('{}D'.format(res_interval)).sum()])

With my random numbers, I get:

            random
2018-01-01       1
2018-01-02      13
2018-01-07      26
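
As a possible alternative to splitting the frame, newer pandas versions accept an `origin` argument in `resample`. The sketch below assumes pandas 1.3 or later, where `origin='end'` is available, and uses the fixed values from the question so the sums can be checked by hand. The name `df_back` is only illustrative:

```python
import pandas as pd

index = pd.date_range('2018-01-01', '2018-01-11', freq='D')
# Fixed values copied from the question (assumption: pandas >= 1.3).
df = pd.DataFrame({'random': [8, 8, 1, 4, 3, 5, 2, 6, 5, 1, 3]}, index=index)

# origin='end' anchors the 5-day bins at the last timestamp, so the
# incomplete group ends up at the start of the series instead of the end.
df_back = df.resample('5D', origin='end').sum()
print(df_back)
```

The sums match the desired 8, 21, 17. Note that with this origin each row should be labelled with the end of its window rather than the start, so the labels differ from those shown in the question.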



Answer 2:


You could build the group keys by position, counting back from the last row, and group on them:

In [452]: t = np.arange(len(df.index)-1, -1, -1) // 5

In [453]: df.reset_index().groupby(t, sort=False)['index'].agg(['min', 'max'])
Out[453]:
         min        max
2 2018-01-01 2018-01-01
1 2018-01-02 2018-01-06
0 2018-01-07 2018-01-11
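
The output above only shows the group boundaries. The same key can also produce the sums, as in this sketch, which assumes the fixed values from the question and one row per day. The column names `start`, `end` and `total` and the name `summary` are made up for illustration:

```python
import numpy as np
import pandas as pd

index = pd.date_range('2018-01-01', '2018-01-11', freq='D')
df = pd.DataFrame({'random': [8, 8, 1, 4, 3, 5, 2, 6, 5, 1, 3]}, index=index)

# Number the rows from the end (10, 9, ..., 0) and integer-divide by the
# window length: the last 5 rows get key 0, the 5 before them key 1, etc.
t = np.arange(len(df) - 1, -1, -1) // 5

# reset_index() exposes the unnamed DatetimeIndex as a column called 'index'.
summary = (
    df.reset_index()
      .groupby(t, sort=False)
      .agg(start=('index', 'min'), end=('index', 'max'), total=('random', 'sum'))
)
print(summary)
```

Because the key is purely positional, this groups by row count rather than by calendar time, so missing days would shift the windows.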


Source: https://stackoverflow.com/questions/51787945/pandas-resample-time-series-counting-backwards-or-reverse-resample
