pandas dataframes resample over uneven periods / minutes

Submitted by 允我心安 on 2020-02-25 08:23:48

Question


I searched for this but found no solution. If one already exists, sorry for asking; I would be thankful for a link.

I have a dataframe (df) like this:

timestamp          value
2016-03-11 07:37:40 24.6018
2016-03-11 07:37:45 24.6075
2016-03-11 07:37:50 24.599
2016-03-11 07:37:55 24.6047
2016-03-11 07:38:00 24.5905
2016-03-11 07:38:05 24.551
...

Important: the data does not start at an even minute like 07:40:00 but at 07:37:40 (it could be any time). I want to resample it, i.e. calculate mean values over e.g. 5 minutes, labeled with the last timestamp of the rows used. Desired result, with 2016-03-11 07:37:40 as the first timestamp of the raw data:

2016-03-11 07:42:40 24.608
2016-03-11 07:47:40 24.605
2016-03-11 07:52:40 24.59
...

I tried to use

df.resample('5T',how='mean',label='right')

and

df.resample('300S',how='mean',label='right')

with the same result:

2016-03-11 07:40:00 24.618
2016-03-11 07:45:00 24.675
2016-03-11 07:50:00 24.599
...

It calculates over full-minute periods. I found no option to correct this properly. I saw that "base" could be an option, but it does not seem very intuitive or nicely coded.

Any help would be appreciated.


Answer 1:


Check this: I used rolling, which rolls over the given time window and applies operations such as sum or mean. For this you need to know the start and end datetime values.

Code:

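import pandas as pd

# Parse the timestamps and use them as the index (time-based rolling needs a DatetimeIndex)
df['timestamp'] = pd.to_datetime(df['timestamp'])
df = df.set_index('timestamp')
# Mean over a trailing 15-second window ending at each row
df = df.rolling('15s').mean()
# Keep one row every 10 seconds, counted from the first timestamp
mask = pd.date_range('2016-03-11 07:37:40', '2016-03-11 07:38:05', freq='10s')
df = df.loc[mask]
df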

                        value
2016-03-11 07:37:40 24.601800
2016-03-11 07:37:50 24.602767
2016-03-11 07:38:00 24.598067

Use your desired window instead of '15s' in rolling, and adjust the date_range that I used in the same way. Let me know if this works for you.
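As a possible alternative to rolling, newer pandas versions (1.1 and later, as far as I know) let resample anchor its bins at the first timestamp through the origin argument, which covers what "base" was used for in older versions. The sketch below is only an illustration: the frame is rebuilt with made-up values (0 to 179) on the 5-second grid from the question, so the means are not the ones from the real data.

```python
import pandas as pd

# Made-up stand-in for the question's data: 15 minutes of 5-second samples
# starting at the uneven timestamp 07:37:40 (values are placeholders).
idx = pd.date_range("2016-03-11 07:37:40", periods=180, freq="5s")
df = pd.DataFrame({"value": range(180)}, index=idx)

# origin="start" anchors the 5-minute bins at the first timestamp
# instead of at a round clock time such as 07:40:00.
# label="right" names each bin after its right edge;
# closed="left" means the sample sitting exactly on that edge goes to the next bin.
out = df.resample("5min", origin="start", label="right", closed="left").mean()
print(out)
```

With this placeholder data the bins are labeled 07:42:40, 07:47:40 and 07:52:40. If the sample that falls exactly on the label should be counted in that bin, closed="right" may be closer to what is wanted, but that changes which rows end up in the first bin, so it is worth checking against the real data.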



Source: https://stackoverflow.com/questions/47432198/pandas-dataframes-resample-over-uneven-periods-minutes
