Remove non-business days rows from pandas dataframe

前端 未结 4 1366
一向
一向 2020-12-10 10:36

I have a dataframe with second timeseries data of wheat in df.

df = wt[\"WHEAT_USD\"]

2016-05-02 02:00:00+02:00    4.780
2016-05-02 02:01:00+02         


        
相关标签:
4条回答
  • 2020-12-10 11:35

    Pandas BDay just ends up using .dayofweek<5 like the chosen answer, but can be extended to account for bank holidays, etc.

    import pandas as pd
    from pandas.tseries.offsets import BDay
    
    isBusinessDay = BDay().onOffset
    csv_path = 'C:\\Python27\\Lib\\site-packages\\bokeh\\sampledata\\daylight_warsaw_2013.csv'
    dates_df = pd.read_csv(csv_path)
    match_series = pd.to_datetime(dates_df['Date']).map(isBusinessDay)
    dates_df[match_series]
    
    0 讨论(0)
  • 2020-12-10 11:36

    I am building a backtester for stock/FX trading and I also have these issue with days that are nan because that they are holidays or other non trading days.. you can download a financial calendar for the days that there is no trading and then you need to think about timezone and weekends.. etc..

    But the best solution is not to use date/time as the index for the candles or price. So do not connect your price data to a date/time but just to a counter of candles or prices .. you can use a second index for this.. so for calculations of MA or other technical lines dont use date/time .. if you look at Metatrader 4/5 it also doesnt use date/time but the index of the data is the candle number !!

    I think that you need to let go of the date-time for the price if you work with stock or FX data , of cause you can put them in a column of the data-frame but dont use it as the index This way you can avoid many problems

    0 讨论(0)
  • 2020-12-10 11:36

    using workdays, you can count for holidays pretty easily

        import workdays as wd
    
        def drop_non_busdays(df, holidays=None):
            if holidays is None:
                holidays = []
            start_date = df.index.to_list()[0].date()
            end_date = df.index.to_list()[-1].date()
    
    
            start_wd = wd.workday(wd.workday(start_date, -1, holidays), 1, holidays)
            end_wd = wd.workday(wd.workday(end_date, 1, holidays), -1, holidays)
    
            b_days = [start_wd]
            while b_days[-1] < end_wd:
                b_days.append(wd.workday(b_days[-1], 1, holidays))
    
            valid = [i in b_days for i in df.index]
            return df[valid]
    
    0 讨论(0)
  • 2020-12-10 11:38

    One simple solution is to slice out the days not in Monday to Friday:

    In [11]: s[s.index.dayofweek < 5]
    Out[11]:
    2016-05-02 00:00:00    4.780
    2016-05-02 00:01:00    4.777
    2016-05-02 00:02:00    4.780
    2016-05-02 00:03:00    4.780
    2016-05-02 00:04:00    4.780
    Name: closeAsk, dtype: float64
    

    Note: this doesn't take into account bank holidays etc.

    0 讨论(0)
提交回复
热议问题