Pandas drop rows by time duration

小鲜肉 2021-01-16 22:22

I would like to drop dataframe rows by time condition (ignoring date). My data contains around 100 million rows. I have around 100 columns and each column has different samp

1 Answer
  • 2021-01-16 22:58

    Slicing with df = df.loc['2018-01-01 00:00:00.000000':'2018-01-01 00:00:00.000500'] gives you a new DataFrame that contains only the rows between 2018-01-01 00:00:00.000000 and 2018-01-01 00:00:00.000500, with both endpoints included. You can then apply your filter for the desired dates.

    import pandas as pd

    # leave_duration = 0.01 seconds
    # drop_duration = 0.1 seconds

    # Build a sample index from three ranges with different sampling intervals.
    i = pd.date_range('2018-01-01', periods=1000, freq='2ms')
    i = i.append(pd.date_range('2018-01-01', periods=1000, freq='3ms'))
    i = i.append(pd.date_range('2018-01-01', periods=1000, freq='0.5ms'))
    df = pd.DataFrame({'A': range(len(i))}, index=i)
    df = df.sort_index()  # label slicing on a DatetimeIndex needs a sorted index
    print(df)

    # Keep only rows from 2018-01-01 00:00:00.000000 through 2018-01-01 00:00:00.000500.
    df = df.loc['2018-01-01 00:00:00.000000':'2018-01-01 00:00:00.000500']
    print(df)
    

    Output before the filter is applied:

                                   A
    2018-01-01 00:00:00.000000     0
    2018-01-01 00:00:00.000000  2000
    2018-01-01 00:00:00.000000  1000
    2018-01-01 00:00:00.000500  2001
    2018-01-01 00:00:00.001000  2002
    ...                          ...
    2018-01-01 00:00:02.985000  1995
    2018-01-01 00:00:02.988000  1996
    2018-01-01 00:00:02.991000  1997
    2018-01-01 00:00:02.994000  1998
    2018-01-01 00:00:02.997000  1999
    
    [3000 rows x 1 columns]
    

    Output after the filter is applied:

    
                                   A
    2018-01-01 00:00:00.000000     0
    2018-01-01 00:00:00.000000  2000
    2018-01-01 00:00:00.000000  1000
    2018-01-01 00:00:00.000500  2001
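
    The slice above keeps one fixed window. If the goal is a repeating pattern instead, the commented-out leave_duration and drop_duration values could be read as "keep 0.01 s of rows, then drop 0.1 s of rows, and repeat". That reading is an assumption, not something the question confirms. Under it, a minimal sketch is to fold each timestamp's offset from the first row into one cycle and keep only the rows that land in the leading part. The names cycle, phase, and keep are illustrative.

```python
import pandas as pd

# Assumed meaning of the two commented-out parameters:
# keep rows for leave_duration, then drop rows for drop_duration, repeating.
leave_duration = pd.Timedelta("10ms")   # 0.01 seconds
drop_duration = pd.Timedelta("100ms")   # 0.1 seconds
cycle = leave_duration + drop_duration

# Small sample frame, sampled every 2 ms (illustrative data only).
idx = pd.date_range("2018-01-01", periods=1000, freq="2ms")
df = pd.DataFrame({"A": range(len(idx))}, index=idx)

# Offset of every timestamp from the first row, as a TimedeltaIndex.
elapsed = df.index - df.index[0]

# Position of each row inside its keep/drop cycle.
phase = elapsed % cycle

# True for rows that fall in the "leave" part of the cycle.
keep = phase < leave_duration

filtered = df[keep]
print(filtered.head(10))
```

    The mask is computed in one vectorized pass over the index, so it should suit a large frame better than looping over windows, although that has not been measured here.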
    