How to include end date in pandas date_range method?

[愿得一人] 2021-01-02 02:42

From pd.date_range('2016-01', '2016-05', freq='M').strftime('%Y-%m'), the last month is 2016-04, but I was expecting it to be 2016-05.

7 answers
  • 2021-01-02 03:04

    You can use .union to add the next logical value after initializing the date_range. It should work as written for any frequency:

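    d = pd.date_range('2016-01', '2016-05', freq='M')
    # Step one frequency unit past the last value (adding a bare integer to a Timestamp is no longer supported)
    d = d.union([d[-1] + d.freq]).strftime('%Y-%m')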
    

    Alternatively, you can use period_range instead of date_range. Depending on what you intend to do, this might not be the right thing to use, but it satisfies your question:

    pd.period_range('2016-01', '2016-05', freq='M').strftime('%Y-%m')
    

    In either case, the resulting output is as expected:

    ['2016-01' '2016-02' '2016-03' '2016-04' '2016-05']
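
To illustrate the caveat about period_range: it returns a PeriodIndex rather than a DatetimeIndex. If timestamps are needed afterwards, one possible route is to_timestamp(), which by default maps each period to its start. This is only a sketch; the variable names are made up for illustration.

```python
import pandas as pd

# Both endpoints are included because each value stands for a whole month.
periods = pd.period_range('2016-01', '2016-05', freq='M')
labels = list(periods.strftime('%Y-%m'))

# Convert to timestamps if a DatetimeIndex is required later on.
# By default each period is mapped to its first instant (the month start).
stamps = periods.to_timestamp()

print(labels)
print(stamps)
```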
    
  • 2021-01-02 03:05

    Include the day when specifying the dates in the date_range call:

    pd.date_range('2016-01-31', '2016-05-31', freq='M').strftime('%Y-%m')
    
    array(['2016-01', '2016-02', '2016-03', '2016-04', '2016-05'], 
          dtype='|S7')
    
  • 2021-01-02 03:22

    I had a similar problem when using datetime objects in a dataframe. I set the boundaries with .min() and .max() and then filled in the missing dates with pd.date_range. Unfortunately, the returned list/df was missing the maximum value.

    I found two workarounds for this:

    1) Pass closed=None to pd.date_range. This worked in the example below; however, it didn't work for me when working only with dataframes (no idea why).

    2) If option #1 doesn't work, add one extra unit (in this case a day) to the end date with datetime.timedelta(). In the example below this overshoots by a day, but it can help if date_range isn't giving you the full range.

    import pandas as pd
    import datetime as dt

    # List of dates as strings
    time_series = ['2020-01-01', '2020-01-03', '2020-01-05', '2020-01-06', '2020-01-07']

    # Create a dataframe with the time data converted to datetime objects
    raw_data_df = pd.DataFrame(pd.to_datetime(time_series), columns=['Raw_Time_Series'])

    # Create an indexed_time list that includes the missing dates and the full time range

    # Option No. 1: use closed=None (the default, which keeps both endpoints).
    # In pandas >= 1.4 this parameter is spelled inclusive='both'; closed was removed in pandas 2.0.
    indexed_time = pd.date_range(start=raw_data_df.Raw_Time_Series.min(), end=raw_data_df.Raw_Time_Series.max(), freq='D', closed=None)
    print('indexed_time option #1 = ', indexed_time)

    # Option No. 2: extend the end by one unit (in this case a day) with datetime.timedelta
    indexed_time = pd.date_range(start=raw_data_df.Raw_Time_Series.min(), end=raw_data_df.Raw_Time_Series.max() + dt.timedelta(days=1), freq='D')
    print('indexed_time option #2 = ', indexed_time)

    # In this case you overshoot by an extra day because date_range already works properly.
    # However, if closed=None doesn't cover the full range, this is a good workaround.
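
On recent pandas versions the endpoint choice is made through the inclusive parameter instead. The sketch below assumes pandas 1.4 or later and uses made-up variable names; it compares the default inclusive='both' with inclusive='left', which drops the end date.

```python
import pandas as pd

start = pd.Timestamp('2020-01-01')
end = pd.Timestamp('2020-01-07')

# 'both' (the default) keeps the start and the end date.
both = pd.date_range(start, end, freq='D', inclusive='both')

# 'left' keeps the start date but drops the end date.
left = pd.date_range(start, end, freq='D', inclusive='left')

print(both[-1], len(both))
print(left[-1], len(left))
```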
    
  • 2021-01-02 03:24

    A way to do it without having to work out the month ends yourself:

    pd.date_range(*(pd.to_datetime(['2016-01', '2016-05']) + pd.offsets.MonthEnd()), freq='M')
    
    DatetimeIndex(['2016-01-31', '2016-02-29', '2016-03-31', '2016-04-30',
               '2016-05-31'],
              dtype='datetime64[ns]', freq='M')
    
  • 2021-01-02 03:27

    For the later crowd: you can also try the month-start frequency. (date_range has no format parameter, so it is left out here; use strftime to format the result.)

    >>> pd.date_range('2016-01', '2016-05', freq='MS')
    DatetimeIndex(['2016-01-01', '2016-02-01', '2016-03-01', '2016-04-01',
                   '2016-05-01'],
                  dtype='datetime64[ns]', freq='MS')
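
A short sketch of how the month-start approach yields the strings from the question: the end bound '2016-05' is parsed as 2016-05-01, which is itself a month start, so it is kept. The variable names are only for illustration.

```python
import pandas as pd

# With freq='MS' every value is the first day of a month,
# so the end bound 2016-05-01 falls on the grid and is included.
starts = pd.date_range('2016-01', '2016-05', freq='MS')

# Format as year-month strings, as in the original question.
labels = list(starts.strftime('%Y-%m'))
print(labels)
```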
    
  • 2021-01-02 03:27

    I don't think so. You need to pass the (n+1)th boundary instead:

    pd.date_range('2016-01', '2016-06', freq='M').strftime('%Y-%m')
    

    The start and end dates are inclusive bounds, so date_range will not generate any dates outside of them. http://pandas.pydata.org/pandas-docs/stable/timeseries.html

    Either way, you have to manually add some information. I believe adding just one more month is not a lot of work.
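
To make the reasoning concrete, here is a minimal sketch of why May is dropped: the string '2016-05' is parsed as 2016-05-01, and the month end 2016-05-31 lies after that bound. The MonthEnd offset object is used as the frequency here, on the assumption that it behaves the same as the month-end alias while avoiding differences in alias spelling between pandas versions.

```python
import pandas as pd

end = pd.Timestamp('2016-05')            # parsed as 2016-05-01 00:00:00
may_end = end + pd.offsets.MonthEnd()    # rolls forward to 2016-05-31

# Month-end dates between the bounds: 2016-05-31 is past `end`, so it is excluded.
idx = pd.date_range('2016-01', end, freq=pd.offsets.MonthEnd())

# Moving the end bound to the month end brings May back in.
idx_full = pd.date_range('2016-01', may_end, freq=pd.offsets.MonthEnd())

print(idx[-1], idx_full[-1])
```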
