How to sort by English (dd/mm/YYYY) date format, not American (mm/dd/YYYY), with pandas .sort()

情话喂你 2020-12-21 18:01
    symb                dates
4     BLK  01/03/2014 09:00:00
0     BBR  02/06/2014 09:00:00
21     HZ  02/06/2014 09:00:00
24   OMNI  02/07/2014 09:00:00
31   NOTE           


        
3 answers
  •  隐瞒了意图╮
    2020-12-21 18:23

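    You can convert the column with to_datetime and then sort it with sort_values. By default, ambiguous values such as 01/03/2014 are read month-first; pass dayfirst=True to read them day-first: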

    #format mm/dd/YYYY
    df['dates'] = pd.to_datetime(df['dates'])
    print (df.sort_values('dates'))
        symb               dates
    4    BLK 2014-01-03 09:00:00
    0    BBR 2014-02-06 09:00:00
    21    HZ 2014-02-06 09:00:00
    24  OMNI 2014-02-07 09:00:00
    31  NOTE 2014-03-04 09:00:00
    40   RBY 2014-04-07 09:00:00
    65   AMP 2016-03-04 09:00:00
    

    #format dd/mm/YYYY
    df['dates'] = pd.to_datetime(df['dates'], dayfirst=True)
    print (df.sort_values('dates'))
        symb               dates
    4    BLK 2014-03-01 09:00:00
    31  NOTE 2014-04-03 09:00:00
    0    BBR 2014-06-02 09:00:00
    21    HZ 2014-06-02 09:00:00
    24  OMNI 2014-07-02 09:00:00
    40   RBY 2014-07-04 09:00:00
    65   AMP 2016-04-03 09:00:00
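
If every value is known to be day-first, an explicit format string is one way to rule out month/day guessing altogether. The following is only a minimal sketch, not part of the answer above: it rebuilds the sample rows by hand with a default index (the original index values are not reproduced) and assumes every string matches `%d/%m/%Y %H:%M:%S`.

```python
import pandas as pd

# Sample rows copied from the answer's CSV; index values are an assumption
# (default RangeIndex), since the original frame's index is not needed here.
df = pd.DataFrame({
    "symb": ["BLK", "BBR", "HZ", "OMNI", "NOTE", "AMP", "RBY"],
    "dates": [
        "01/03/2014 09:00:00",
        "02/06/2014 09:00:00",
        "02/06/2014 09:00:00",
        "02/07/2014 09:00:00",
        "03/04/2014 09:00:00",
        "03/04/2016 09:00:00",
        "04/07/2014 09:00:00",
    ],
})

# Explicit day-first format: day, month, year, then the time of day.
df["dates"] = pd.to_datetime(df["dates"], format="%d/%m/%Y %H:%M:%S")

# Chronological sort on the parsed datetime column.
result = df.sort_values("dates")
print(result)
```

With an explicit format, a value that does not match raises an error instead of being silently read month-first, which may or may not be what you want for messy input.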
    

    Another solution is to use the parse_dates parameter of read_csv; if the format is dd/mm/YYYY, also add dayfirst=True:

    import pandas as pd
    from io import StringIO  # pandas.compat.StringIO was removed from later pandas versions
    
    temp=u"""symb,dates
    BLK,01/03/2014 09:00:00
    BBR,02/06/2014 09:00:00
    HZ,02/06/2014 09:00:00
    OMNI,02/07/2014 09:00:00
    NOTE,03/04/2014 09:00:00
    AMP,03/04/2016 09:00:00
    RBY,04/07/2014 09:00:00"""
    #after testing, replace 'StringIO(temp)' with 'filename.csv'
    df = pd.read_csv(StringIO(temp), parse_dates=['dates'])
    
    print (df)
       symb               dates
    0   BLK 2014-01-03 09:00:00
    1   BBR 2014-02-06 09:00:00
    2    HZ 2014-02-06 09:00:00
    3  OMNI 2014-02-07 09:00:00
    4  NOTE 2014-03-04 09:00:00
    5   AMP 2016-03-04 09:00:00
    6   RBY 2014-04-07 09:00:00
    
    print (df.dtypes)
    symb             object
    dates    datetime64[ns]
    dtype: object
    
    print (df.sort_values('dates'))
       symb               dates
    0   BLK 2014-01-03 09:00:00
    1   BBR 2014-02-06 09:00:00
    2    HZ 2014-02-06 09:00:00
    3  OMNI 2014-02-07 09:00:00
    4  NOTE 2014-03-04 09:00:00
    6   RBY 2014-04-07 09:00:00
    5   AMP 2016-03-04 09:00:00
    

    #after testing, replace 'StringIO(temp)' with 'filename.csv'
    df = pd.read_csv(StringIO(temp), parse_dates=['dates'], dayfirst=True)
    
    print (df)
       symb               dates
    0   BLK 2014-03-01 09:00:00
    1   BBR 2014-06-02 09:00:00
    2    HZ 2014-06-02 09:00:00
    3  OMNI 2014-07-02 09:00:00
    4  NOTE 2014-04-03 09:00:00
    5   AMP 2016-04-03 09:00:00
    6   RBY 2014-07-04 09:00:00
    
    print (df.dtypes)
    symb             object
    dates    datetime64[ns]
    dtype: object
    
    print (df.sort_values('dates'))
       symb               dates
    0   BLK 2014-03-01 09:00:00
    4  NOTE 2014-04-03 09:00:00
    1   BBR 2014-06-02 09:00:00
    2    HZ 2014-06-02 09:00:00
    3  OMNI 2014-07-02 09:00:00
    6   RBY 2014-07-04 09:00:00
    5   AMP 2016-04-03 09:00:00
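
Both approaches above replace the text column with datetimes. If the original dd/mm/YYYY strings need to stay untouched, one possible alternative is the `key` argument of `sort_values`, which parses the values only for the comparison. This is a hedged sketch that assumes pandas 1.1 or newer (where `key` is available); the names `csv_text` and `sorted_df` are made up for the illustration.

```python
from io import StringIO

import pandas as pd

# Same sample data as in the answer, kept as an in-memory CSV.
csv_text = """symb,dates
BLK,01/03/2014 09:00:00
BBR,02/06/2014 09:00:00
HZ,02/06/2014 09:00:00
OMNI,02/07/2014 09:00:00
NOTE,03/04/2014 09:00:00
AMP,03/04/2016 09:00:00
RBY,04/07/2014 09:00:00"""

# No parse_dates here, so the 'dates' column stays as plain strings.
df = pd.read_csv(StringIO(csv_text))

# The key function receives the column as a Series and returns the values
# to sort by; the stored strings themselves are not modified.
sorted_df = df.sort_values(
    "dates",
    key=lambda s: pd.to_datetime(s, dayfirst=True),
)
print(sorted_df)
```

The trade-off is that the strings are parsed again on every sort, so converting the column once is usually the better choice when the dates are used for more than ordering.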
    
