Question
I want to reassign the timestamps of a series of dates such that they get floored at a frequency of (e.g.) 3 days:
import pandas as pd
x = pd.date_range('01-01-2019', freq='1D', periods=7).floor('3D')
y = pd.date_range('01-01-2022', freq='1D', periods=7).floor('3D')
I am expecting the "floor" to align to the first date and produce:
In[3]: x
Out[3]:
DatetimeIndex(['2019-01-01', '2019-01-01', '2019-01-01', '2019-01-04',
               '2019-01-04', '2019-01-04', '2019-01-07'],
              dtype='datetime64[ns]', freq=None)
In[4]: y
Out[4]:
DatetimeIndex(['2022-01-01', '2022-01-01', '2022-01-01', '2022-01-04',
               '2022-01-04', '2022-01-04', '2022-01-07'],
              dtype='datetime64[ns]', freq=None)
But instead, the dates appear to be floored to a fixed 3-day cycle (presumably multiples of 3 days since January 1, 1970?), so the actual result is:
In[3]: x
Out[3]:
DatetimeIndex(['2018-12-30', '2019-01-02', '2019-01-02', '2019-01-02',
               '2019-01-05', '2019-01-05', '2019-01-05'],
              dtype='datetime64[ns]', freq=None)
In[4]: y
Out[4]:
DatetimeIndex(['2022-01-01', '2022-01-01', '2022-01-01', '2022-01-04',
               '2022-01-04', '2022-01-04', '2022-01-07'],
              dtype='datetime64[ns]', freq=None)
The results for x start on December 30 instead of January 1.
Is there a way to set a "base" for the floor operation in pandas? I say "base" because of the base argument in resample for doing similar adjustments. But I don't want to do any aggregation, just keep each element but reassign the timestamp.
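The guess about January 1, 1970 can be checked directly. The sketch below is only a sanity check, not pandas internals: `EPOCH` and `remainder_since_epoch` are hypothetical names introduced here, and the check assumes tz-naive timestamps and a fixed frequency such as '3D'.

```python
import pandas as pd

# Assumption: fixed-frequency floors are aligned to the Unix epoch.
EPOCH = pd.Timestamp('1970-01-01')

def remainder_since_epoch(ts, freq):
    """Time elapsed between the last epoch-aligned bin edge and `ts`."""
    return (pd.Timestamp(ts) - EPOCH) % pd.Timedelta(freq)

for start in ('2019-01-01', '2022-01-01'):
    # Print the start date, its remainder, and what floor('3D') returns.
    print(start, remainder_since_epoch(start, '3D'), pd.Timestamp(start).floor('3D'))
```

The remainder is 2 days for 2019-01-01 and 0 days for 2022-01-01, which is consistent with x being pulled back to December 30 while y is left on January 1.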
Answer 1:
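import pandas as pd
x = pd.date_range('01-01-2019', freq='1D', periods=7)
y = pd.date_range('01-01-2022', freq='1D', periods=7)
def floor(x, freq):
    # Distance from the first date to the next bin edge on pandas' own grid
    offset = x[0].ceil(freq) - x[0]
    # Shift so the first date sits on a bin edge, floor, then shift back
    return (x + offset).floor(freq) - offset
print(floor(x, '3D'))
print(floor(y, '3D'))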
Output:
DatetimeIndex(['2019-01-01', '2019-01-01', '2019-01-01', '2019-01-04',
               '2019-01-04', '2019-01-04', '2019-01-07'],
              dtype='datetime64[ns]', freq=None)
DatetimeIndex(['2022-01-01', '2022-01-01', '2022-01-01', '2022-01-04',
               '2022-01-04', '2022-01-04', '2022-01-07'],
              dtype='datetime64[ns]', freq=None)
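To see why the shift works, the intermediate values for x can be traced one step at a time. This is just the same arithmetic unrolled; the variable names `shifted`, `floored` and `result` are introduced here for illustration.

```python
import pandas as pd

x = pd.date_range('2019-01-01', freq='1D', periods=7)

# Gap between the first date and the next bin edge on pandas' default grid
offset = x[0].ceil('3D') - x[0]

# After shifting, the first date sits exactly on a bin edge
shifted = x + offset

# Flooring now groups the dates in threes, counted from the first date
floored = shifted.floor('3D')

# Undo the shift so the labels are expressed in the original dates
result = floored - offset

print(offset)
print(result)
```

For x the offset is one day, so each group of three consecutive dates is labeled with its own first date.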
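Adding additional logic to skip the shift when the first date is already aligned:
def floor(x, freq):
    offset = x[0].ceil(freq) - x[0]
    # No adjustment is needed when the first date is already on a bin edge
    adj_needed = (offset != pd.Timedelta(0))
    return (x + offset).floor(freq) - offset if adj_needed else x.floor(freq)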
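A possible alternative, not part of the answer above: subtract an explicit origin, floor the resulting timedeltas, and add the origin back. This is a minimal sketch assuming a tz-naive DatetimeIndex and a fixed frequency; `floor_from` and its `origin` parameter are hypothetical names, not a pandas API.

```python
import pandas as pd

def floor_from(index, freq, origin=None):
    """Floor `index` to multiples of `freq` counted from `origin`.

    `origin` defaults to the first element of `index` (an assumption
    made for this sketch, mirroring the behaviour asked for above).
    """
    origin = index[0] if origin is None else pd.Timestamp(origin)
    # DatetimeIndex - Timestamp gives a TimedeltaIndex, which can be floored
    return (index - origin).floor(freq) + origin

x = pd.date_range('2019-01-01', freq='1D', periods=7)
y = pd.date_range('2022-01-01', freq='1D', periods=7)
print(floor_from(x, '3D'))
print(floor_from(y, '3D'))
```

Passing a different `origin`, for example `floor_from(x, '3D', origin='2018-12-31')`, moves the bin edges without touching the data, which is roughly the role the question had in mind for a "base".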
Source: https://stackoverflow.com/questions/62822739/how-to-set-the-base-of-a-datetime-floor-operation-in-pandas