问题
Using pandas
it is easy to create a monthly series of dates.
import pandas as pd
pd.date_range('2012-04-23', '2013-01-23', freq='BM')
DatetimeIndex(['2012-04-30', '2012-05-31', '2012-06-29', '2012-07-31',
'2012-08-31', '2012-09-28', '2012-10-31', '2012-11-30',
'2012-12-31'],
dtype='datetime64[ns]', freq='BM')
Notice that the dates in the DatetimeIndex
are month ends. I know that it should be considering I chose freq='BM'
but I don't believe I had a choice that would have accomplished my goal.
I'm often in need for producing a monthly series of dates starting from the last business day going back in time every month.
I'd like to see this instead:
DatetimeIndex(['2012-04-23', '2012-05-23', '2012-06-23', '2012-07-23',
'2012-08-23', '2012-09-23', '2012-10-23', '2012-11-23',
'2012-12-23'],
dtype='datetime64[ns]', freq=None)
or another more complicated example might be to get months from '2012-01-30' to '2012-04-30'. I'd expect to see:
DatetimeIndex(['2012-01-30', '2012-02-29', '2012-03-30', '2012-04-30'],
dtype='datetime64[ns]', freq=None)
回答1:
You may be looking for something like this:
from pandas.tseries.offsets import Day, BDay
pd.date_range(start = '2012-01-01', periods = 6, freq = 'MS') + Day(22) + BDay(0)
Out[12]:
DatetimeIndex(['2012-01-23', '2012-02-23', '2012-03-23', '2012-04-23',
'2012-05-23', '2012-06-25'],
dtype='datetime64[ns]', freq=None)
Day(22)
adds an offset of 22 days and BDay
is responsible for business day offset (BDay(0)
takes the nearest business day).
It's a bit more difficult with dates starting at 30th. So I had to write a function for this. (For clarity of code it doesn't allow a custom freq
parameter.)
def my_business_date_range(day, **kwargs):
assert(isinstance(day, int) & (day > 0) & (day < 32))
rng0 = pd.date_range(freq = 'MS', **kwargs)
rng1 = rng0 + pd.tseries.offsets.Day(day-1) + pd.tseries.offsets.BDay(0)
# Correcting overflows:
overflow_idx, = np.nonzero(rng0.month != rng1.month)
if overflow_idx.size > 0:
# rng1 is not mutable
tmp = rng1.tolist()
bme = pd.tseries.offsets.BusinessMonthEnd(-1)
for i in overflow_idx:
tmp[i] = bme(rng1[i])
rng1 = pd.DatetimeIndex(tmp)
return rng1
my_business_date_range(30, start= '2012-01-01', periods = 6)
Out[13]:
DatetimeIndex(['2012-01-30', '2012-02-29', '2012-03-30', '2012-04-30',
'2012-05-30', '2012-06-29'],
dtype='datetime64[ns]', freq=None)
Pandas has also an experimental CustomBusinessMonth and the like but I couldn't make it work.
回答2:
I'm not clear on your question, but believe this is a move in the right direction.
start = '2012-04-23'
end = '2013-01-23'
>>> pd.DatetimeIndex([pd.datetime(ts.year, ts.month, int(end.split("-")[-1]))
for ts in pd.date_range(start, end, freq='BM')])
DatetimeIndex(['2012-04-23', '2012-05-23', '2012-06-23', '2012-07-23', '2012-08-23', '2012-09-23', '2012-10-23', '2012-11-23', '2012-12-23'], dtype='datetime64[ns]', freq=None)
Although not optimized for speed, I believe the following function will return the correct values per your requirements.
def foo(date, periods, forward=True):
if isinstance(date, str):
date = pd.Timestamp(date).date()
dates = [date + relativedelta(date, months=n * (1 if forward else -1)) for n in range(1, periods +1)]
result = []
print dates
for date in dates:
month = date.month
iso_day = date.isoweekday()
if iso_day == 6:
date += dt.timedelta(days=2 if forward else -1)
elif iso_day == 7:
date += dt.timedelta(days=1 if forward else -2)
if date.month != month:
# Gone into next/preceding month. Roll back/forward.
date -= dt.timedelta(days=3 if forward else -3)
result.append(date)
return result
来源:https://stackoverflow.com/questions/37084925/create-monthly-time-series-with-arbitrary-start-dates