Filtering pandas dataframe by day

挽巷 2021-01-12 11:18

I have a pandas DataFrame of minute-level forex data covering one year (371635 rows):

                           O        H        L        C
0                         


        
1 Answer
  • 2021-01-12 11:44

    Avoid Python datetime

    First, you should avoid combining Python datetime objects with pandas operations. pandas and NumPy provide their own vectorised ways to create datetime values for comparison, e.g. pd.Timestamp and pd.to_datetime. Your performance issues here are partly due to this behaviour described in the docs:

    pd.Series.dt.date returns an array of python datetime.date objects

    Using object dtype in this way removes vectorisation benefits, as operations then require Python-level loops.
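
To make the difference concrete, here is a minimal sketch. The frame below is hypothetical stand-in data (three days of minute bars with a single `C` column), not the asker's file, and `normalize()` is used as one possible way to keep a datetime dtype while dropping the time of day.

```python
import numpy as np
import pandas as pd

# Hypothetical minute-level frame standing in for the forex data.
idx = pd.date_range("2017-01-01", periods=3 * 24 * 60, freq="min")
df = pd.DataFrame({"C": np.arange(len(idx), dtype=float)}, index=idx)

# Slow path: .date yields an object array of datetime.date values,
# so later comparisons fall back to Python-level loops.
dates = df.index.date
print(dates.dtype)  # object

# Vectorised path: normalize() sets every timestamp to midnight
# and keeps a datetime64 dtype.
days = df.index.normalize()
target = pd.Timestamp("2017-01-02")
df_target = df[days == target]
```

Comparing `days` against a `pd.Timestamp` stays inside NumPy, which is where the speed-up should come from; the actual gain depends on the data size.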

    Use groupby operations for aggregating by date

    Pandas can already group by date if you floor each timestamp to the start of its day:

    for day, df_day in df.groupby(df.index.floor('d')):
        df_day_t = df_day.between_time('08:30', '09:30')
        # do something
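
If the "do something" step is a plain aggregation, the explicit loop may not be needed at all. The sketch below is one possible variant, not the original answer's code: it filters the time window once and then aggregates per day. The data are hypothetical, the column names O/H/L/C follow the question, and the output names `high`, `low` and `bars` are made up for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical two days of minute bars; column names follow the question.
idx = pd.date_range("2017-01-02", periods=2 * 24 * 60, freq="min")
close = np.arange(len(idx), dtype=float)
df = pd.DataFrame(
    {"O": close, "H": close + 1, "L": close - 1, "C": close}, index=idx
)

# Filter the time window once (both endpoints are included by default),
# then group the remaining rows by calendar day.
window = df.between_time("08:30", "09:30")
per_day = window.groupby(window.index.normalize()).agg(
    high=("H", "max"),   # highest high inside the window
    low=("L", "min"),    # lowest low inside the window
    bars=("C", "count"), # number of minute bars inside the window
)
```

`per_day` then has one row per day, indexed by the midnight timestamp of that day.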
    

    As another example, you can access a slice for a particular day in this way:

    g = df.groupby(df.index.floor('d'))
    my_day = pd.Timestamp('2017-01-01')
    df_slice = g.get_group(my_day)
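
One caveat with get_group is that it raises a KeyError when the requested day has no rows (a weekend, for instance). A minimal sketch of guarding against that, again on hypothetical data, together with partial-string indexing as an alternative for a one-off lookup:

```python
import numpy as np
import pandas as pd

# Hypothetical two days of minute bars.
idx = pd.date_range("2017-01-01", periods=2 * 24 * 60, freq="min")
df = pd.DataFrame({"C": np.arange(len(idx), dtype=float)}, index=idx)

g = df.groupby(df.index.normalize())
my_day = pd.Timestamp("2017-01-01")

# get_group raises KeyError for a day with no rows, so check membership
# first and fall back to an empty frame with the same columns.
df_slice = g.get_group(my_day) if my_day in g.groups else df.iloc[0:0]

# For a single day, partial-string indexing on a sorted DatetimeIndex
# selects the same rows without building a groupby object.
df_loc = df.loc["2017-01-01"]
```

The groupby object pays off when many days are looked up repeatedly; for one day, `df.loc` is probably simpler.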
    