slice pandas timeseries on date +/- 2 business days

野趣味 2021-02-10 09:15

I have the following time series:

In [65]: p
Out[65]: 
Date
2008-06-02    125.20
2008-06-03    124.47
2008-06-04    124.40
2008-06-05    126.89
2008-06-06    122.84
2 answers
  • 2021-02-10 10:13

    Pandas has some pretty nice business day functionality built in that will handle this automatically. For this exact problem, it actually ends up being a bit more code, but it will handle a much more general case very easily.

    import numpy as np
    import pandas as pd

    In [1]: ind = pd.date_range('2008-06-02', '2008-06-12', freq='B')
    
    In [2]: p = pd.Series(np.random.random(len(ind)), index=ind)
    
    In [3]: p
    Out[3]:
    2008-06-02    0.606132
    2008-06-03    0.328327
    2008-06-04    0.842873
    2008-06-05    0.272547
    2008-06-06    0.013640
    2008-06-09    0.357935
    2008-06-10    0.517029
    2008-06-11    0.992851
    2008-06-12    0.053158
    Freq: B, dtype: float64
    
    In [4]: t0 = pd.Timestamp('2008-6-6')
    
    In [5]: from pandas.tseries import offsets
    
    In [6]: delta = offsets.BDay(2)
    

    This creates an offset of two business days. You can also build offsets in other time units, or even combinations of time units. With the starting point and the delta, you can now slice in the standard way:

    In [7]: p[t0 - delta:t0 + delta]
    Out[7]:
    2008-06-04    0.842873
    2008-06-05    0.272547
    2008-06-06    0.013640
    2008-06-09    0.357935
    2008-06-10    0.517029
    Freq: B, dtype: float64
    

    The nice thing about this approach is that the interval is not tied to the number of rows. If you had hourly data, perhaps with some missing points, you could capture two business days in exactly the same way. The same holds if your data source also contains weekend data but you still want +/- 2 business days.
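
    As a rough illustration of that point, here is a minimal sketch on hourly data that also covers the weekend. The helper name business_day_window and the sample values are assumptions made for this example, not part of the original answer. Note that the end of the slice falls at midnight of the last business day, so with intraday data only the first timestamp of that day is included.

```python
import numpy as np
import pandas as pd
from pandas.tseries.offsets import BDay


def business_day_window(series, center, n=2):
    """Slice `series` from n business days before `center` to n after it.

    Hypothetical helper: the slice is label-based, so both ends are inclusive.
    """
    center = pd.Timestamp(center)
    delta = BDay(n)
    return series.loc[center - delta : center + delta]


# Hourly index that includes Saturday and Sunday (values are arbitrary).
idx = pd.date_range("2008-06-02", "2008-06-13 23:00", freq=pd.Timedelta(hours=1))
hourly = pd.Series(np.arange(len(idx), dtype=float), index=idx)

window = business_day_window(hourly, "2008-06-06")

# Optionally drop the weekend rows that fall inside the window
# (dayofweek: Monday=0 ... Sunday=6).
weekdays_only = window[window.index.dayofweek < 5]

print(window.index[0], window.index[-1])
```

    The window runs from 2008-06-04 00:00 to 2008-06-10 00:00 regardless of how many rows lie in between, which is the property the answer describes.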

  • 2021-02-10 10:14

    You could use the index method get_loc to find the position of the date, and then slice by position (here s is the series from the question):

    d = pd.to_datetime('2008-06-06')
    loc = s.index.get_loc(d)
    
    In [12]: loc
    Out[12]: 4
    
    In [13]: s[loc-2:loc+3]
    Out[13]: 
    2008-06-04    124.40
    2008-06-05    126.89
    2008-06-06    122.84
    2008-06-09    123.14
    2008-06-10    122.53
    Name: SPY
    


    If you only want the rows within two calendar days of the date:

    In [14]: dt = datetime.timedelta(1)
    
    In [15]: s[d - 2*dt:d + 2*dt]
    Out[15]: 
    2008-06-04    124.40
    2008-06-05    126.89
    2008-06-06    122.84
    Name: SPY
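
    The positional approach can be wrapped in a small function. This is only a sketch: the name rows_around is hypothetical, the series is rebuilt from the values shown in this thread, and it assumes a unique index that contains the requested date. It uses .iloc for the positional slice and clamps the start at zero, because a negative start such as loc - 2 for the first row would otherwise count from the end and return an empty slice.

```python
import pandas as pd


def rows_around(series, when, n=2):
    """Return up to n rows on each side of the row labelled `when`.

    Hypothetical helper; assumes a unique index that contains `when`
    (get_loc raises KeyError otherwise).
    """
    loc = series.index.get_loc(pd.Timestamp(when))  # integer position
    start = max(loc - n, 0)  # avoid a negative start wrapping around
    return series.iloc[start : loc + n + 1]


# Series rebuilt from the values shown in the question and in this answer.
s = pd.Series(
    [125.20, 124.47, 124.40, 126.89, 122.84, 123.14, 122.53],
    index=pd.to_datetime(
        ["2008-06-02", "2008-06-03", "2008-06-04", "2008-06-05",
         "2008-06-06", "2008-06-09", "2008-06-10"]
    ),
    name="SPY",
)

print(rows_around(s, "2008-06-06"))
print(rows_around(s, "2008-06-02"))  # fewer rows at the edge of the data
```

    Unlike the offset-based slice, this counts rows rather than days, so it only matches "2 business days" when the data has exactly one row per business day.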
    