pandas time series: resample

Posted by 心不动则不痛 on 2020-03-21 10:26:22


Difference between resample and groupby:
resample: re-samples the data into bins of a given time unit
groupby: aggregates the data over the values of a given key
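
To make the distinction concrete, here is a minimal sketch. The three-point series `s` and its timestamps are made up for illustration and are not from the original post. It relies on resample producing one row per time bin, including bins that contain no data, while groupby only returns groups that actually occur in the data.

```python
import pandas as pd

# Hypothetical data: two points in the first 3-minute bin, none in the
# second bin, one in the third.
index = pd.to_datetime(
    ["2000-01-01 00:00", "2000-01-01 00:01", "2000-01-01 00:07"]
)
s = pd.Series([1, 2, 3], index=index)

# resample builds a regular 3-minute grid; the empty bin sums to 0.
by_resample = s.resample("3min").sum()

# groupby on the floored timestamps only sees keys present in the data.
by_groupby = s.groupby(s.index.floor("3min")).sum()

print(by_resample)  # three rows: 00:00, 00:03, 00:06
print(by_groupby)   # two rows: 00:00, 00:06
```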

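Function signature (as of the pandas version current when this was written):
DataFrame.resample(rule, how=None, axis=0, fill_method=None, closed=None, label=None, convention='start', kind=None, loffset=None, limit=None, base=0)
The how parameter is deprecated. Instead, call an aggregation method such as .sum() or .mean() on the object that resample() returns.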
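
As a rough sketch of what replaces how: resample() returns a Resampler object, and the aggregation is chosen by calling a method on it. The series here is an assumed nine-point, one-minute example, and the 'min' alias is used for minutes.

```python
import pandas as pd

index = pd.date_range("2000-01-01", periods=9, freq="min")
series = pd.Series(range(9), index=index)

# resample() alone only defines the bins.
resampler = series.resample("3min")

# Old style was how='sum'; the method call below is the replacement.
total = resampler.sum()

# Several aggregations at once give a DataFrame with one column each.
stats = resampler.agg(["sum", "mean", "max"])
print(stats)
```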


Now for some practice.

import numpy as np
import pandas as pd

 
Start by creating a series with 9 one minute timestamps.

index = pd.date_range('1/1/2000', periods=9, freq='min')
series = pd.Series(range(9), index=index)

 
Downsample the series into 3 minute bins and sum the values of the timestamps falling into a bin.

series.resample('3min').sum()

 
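Downsample the series into 3 minute bins as above, but label each bin with its right edge instead of its left. Note that the value at the label timestamp is not included in the bin it labels: the bin labeled 2000-01-01 00:03:00 sums to 3 (0 + 1 + 2) and does not contain the value 3. To include this value, close the right side of the bin interval, as in the next example.

series.resample('3min', label='right').sum()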


Downsample the series into 3 minute bins as above, but close the right side of the bin interval.

series.resample('3min', label='right', closed='right').sum()
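
The effect of label and closed is easiest to see side by side. The following sketch rebuilds the same nine-point series and compares the three downsampling variants; the variable names are chosen here for illustration.

```python
import pandas as pd

index = pd.date_range("2000-01-01", periods=9, freq="min")
series = pd.Series(range(9), index=index)

# Default: bins are closed on the left and labeled with the left edge.
left = series.resample("3min").sum()

# Same bins, but each one is labeled with its right edge.
right_label = series.resample("3min", label="right").sum()

# Bins closed on the right: 00:03 now belongs to the bin labeled 00:03,
# and 00:00 falls into a bin of its own labeled 00:00.
right_closed = series.resample("3min", label="right", closed="right").sum()

print(left.tolist())          # [3, 12, 21]
print(right_label.tolist())   # [3, 12, 21]
print(right_closed.tolist())  # [0, 6, 15, 15]
```

Changing label only moves the timestamps in the index, while changing closed changes which values land in each bin.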


Upsample the series into 30 second bins.

series.resample('30s').asfreq()

 
Upsample the series into 30 second bins and fill the NaN values by forward filling (ffill; pad is an older alias that has since been deprecated).

series.resample('30s').ffill()

 
Upsample the series into 30 second bins and fill the NaN values using the bfill method.

series.resample('30s').bfill()
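
To compare the three upsampling results directly, one option is to put them in a single DataFrame. This is only a sketch: the column names and the `filled` variable are chosen here for illustration.

```python
import pandas as pd

index = pd.date_range("2000-01-01", periods=9, freq="min")
series = pd.Series(range(9), index=index)

upsampled = series.resample("30s")

filled = pd.DataFrame({
    "asfreq": upsampled.asfreq(),  # new 30-second slots are NaN
    "ffill": upsampled.ffill(),    # new slots take the previous value
    "bfill": upsampled.bfill(),    # new slots take the next value
})

print(filled.head())
```

Nine one-minute points span eight minutes, so the 30 second grid has 17 rows, and every second row is a newly created slot.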

 
Pass a custom function via apply

def custom_resampler(array_like):
    return np.sum(array_like)+5

series.resample('3min').apply(custom_resampler)
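
As a quick check of what apply does here, the sketch below confirms that the custom function is called once per bin, so the result equals the per-bin sum plus 5. The second function, `value_spread`, is a hypothetical extra example and not part of the original post.

```python
import numpy as np
import pandas as pd

index = pd.date_range("2000-01-01", periods=9, freq="min")
series = pd.Series(range(9), index=index)

def custom_resampler(array_like):
    # Receives the values of one bin and returns a single number.
    return np.sum(array_like) + 5

def value_spread(array_like):
    # Hypothetical second example: max minus min within one bin.
    return array_like.max() - array_like.min()

result = series.resample("3min").apply(custom_resampler)
expected = series.resample("3min").sum() + 5
spread = series.resample("3min").apply(value_spread)

print(result.tolist())              # [8, 17, 26]
print((result == expected).all())   # True
print(spread.tolist())              # [2, 2, 2]
```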

 
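Appendix: common time frequency aliases (pandas 2.2 and later rename several of these: A to YE, M to ME, H to h, T to min, S to s)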
A year
M month
W week
D day
H hour
T minute
S second
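
As a small illustration of using these aliases, the sketch below resamples an assumed two weeks of daily data to weekly totals. The data is made up. Note that for 'W' pandas closes and labels bins on the right by default, so each week is labeled with its final Sunday.

```python
import pandas as pd

# 2000-01-03 is a Monday; 14 daily values cover exactly two weeks.
daily = pd.Series(
    range(14),
    index=pd.date_range("2000-01-03", periods=14, freq="D"),
)

# 'W' means weekly bins ending on Sunday.
weekly = daily.resample("W").sum()

print(weekly)
```

An alias can also carry a multiplier, as in the 3 minute and 30 second rules used in the examples above.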

