How to sort by English (dd/mm/yyyy) date format rather than American (mm/dd/yyyy) with pandas .sort()

情话喂你 2020-12-21 18:01
    symb                dates
4     BLK  01/03/2014 09:00:00
0     BBR  02/06/2014 09:00:00
21     HZ  02/06/2014 09:00:00
24   OMNI  02/07/2014 09:00:00
31   NOTE           


        
3 Answers
  • 2020-12-21 18:18

    You can convert the column with pandas.to_datetime, passing an explicit format argument, and then sort on it.

    >> df['date'] = pd.to_datetime(df['date'], format='%m/%d/%Y %H:%M:%S')
    >> df.sort_values('date')
    
                   date    symb
    0 2014-01-03 09:00:00   BLK
    1 2014-02-06 09:00:00   BBR
    2 2014-02-06 09:00:00    HZ
    3 2014-02-07 09:00:00  OMNI
    4 2014-03-04 09:00:00  NOTE
    6 2014-04-07 09:00:00   RBY
    5 2016-03-04 09:00:00   AMP
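
For dates written day-first (the English convention the question asks about), the same approach should work with the day and month directives swapped. The following is a minimal sketch, assuming the column is named dates as in the question and that every value follows dd/mm/yyyy HH:MM:SS; the sample rows are copied from the thread.

```python
import pandas as pd

# Sample rows copied from the thread; the strings are read here as dd/mm/yyyy.
df = pd.DataFrame({
    "symb": ["BLK", "BBR", "HZ", "OMNI", "NOTE", "AMP", "RBY"],
    "dates": ["01/03/2014 09:00:00", "02/06/2014 09:00:00", "02/06/2014 09:00:00",
              "02/07/2014 09:00:00", "03/04/2014 09:00:00", "03/04/2016 09:00:00",
              "04/07/2014 09:00:00"],
})

# %d comes before %m, so "01/03/2014" is parsed as 1 March 2014.
df["dates"] = pd.to_datetime(df["dates"], format="%d/%m/%Y %H:%M:%S")

# mergesort is stable, so rows with equal timestamps keep their original order.
result = df.sort_values("dates", kind="mergesort")
print(result)
```

An explicit format also makes pandas raise an error on values that do not match, rather than guessing at the day/month order.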
    
  • 2020-12-21 18:23

    You can convert the column with to_datetime and then sort it with sort_values:

    #format mm/dd/YYYY
    df['dates'] = pd.to_datetime(df['dates'])
    print (df.sort_values('dates'))
        symb               dates
    4    BLK 2014-01-03 09:00:00
    0    BBR 2014-02-06 09:00:00
    21    HZ 2014-02-06 09:00:00
    24  OMNI 2014-02-07 09:00:00
    31  NOTE 2014-03-04 09:00:00
    40   RBY 2014-04-07 09:00:00
    65   AMP 2016-03-04 09:00:00
    

    #format dd/mm/YYYY
    df['dates'] = pd.to_datetime(df['dates'], dayfirst=True)
    print (df.sort_values('dates'))
        symb               dates
    4    BLK 2014-03-01 09:00:00
    31  NOTE 2014-04-03 09:00:00
    0    BBR 2014-06-02 09:00:00
    21    HZ 2014-06-02 09:00:00
    24  OMNI 2014-07-02 09:00:00
    40   RBY 2014-07-04 09:00:00
    65   AMP 2016-04-03 09:00:00
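
If the column should stay as dd/mm/yyyy text, the conversion can be applied only for the comparison. This is a sketch, assuming a pandas version where sort_values accepts a key argument (1.1 or later); the shortened sample frame is illustrative.

```python
import pandas as pd

df = pd.DataFrame({
    "symb": ["BLK", "BBR", "NOTE", "AMP"],
    "dates": ["01/03/2014 09:00:00", "02/06/2014 09:00:00",
              "03/04/2014 09:00:00", "03/04/2016 09:00:00"],
})

# key= receives the column as a Series and must return a Series of the same
# length; the parsed timestamps are used for ordering only.
ordered = df.sort_values(
    "dates",
    key=lambda col: pd.to_datetime(col, dayfirst=True),
)
print(ordered)
```

The dates column of ordered still holds the original strings, so no reformatting is needed for display.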
    

    Another solution is to use the parse_dates parameter of read_csv; if the format is dd/mm/YYYY, also add dayfirst=True:

    import pandas as pd
    import numpy as np
    from io import StringIO
    
    temp=u"""symb,dates
    BLK,01/03/2014 09:00:00
    BBR,02/06/2014 09:00:00
    HZ,02/06/2014 09:00:00
    OMNI,02/07/2014 09:00:00
    NOTE,03/04/2014 09:00:00
    AMP,03/04/2016 09:00:00
    RBY,04/07/2014 09:00:00"""
    #after testing, replace 'StringIO(temp)' with 'filename.csv'
    df = pd.read_csv(StringIO(temp), parse_dates=['dates'])
    
    print (df)
       symb               dates
    0   BLK 2014-01-03 09:00:00
    1   BBR 2014-02-06 09:00:00
    2    HZ 2014-02-06 09:00:00
    3  OMNI 2014-02-07 09:00:00
    4  NOTE 2014-03-04 09:00:00
    5   AMP 2016-03-04 09:00:00
    6   RBY 2014-04-07 09:00:00
    
    print (df.dtypes)
    symb             object
    dates    datetime64[ns]
    dtype: object
    
    print (df.sort_values('dates'))
       symb               dates
    0   BLK 2014-01-03 09:00:00
    1   BBR 2014-02-06 09:00:00
    2    HZ 2014-02-06 09:00:00
    3  OMNI 2014-02-07 09:00:00
    4  NOTE 2014-03-04 09:00:00
    6   RBY 2014-04-07 09:00:00
    5   AMP 2016-03-04 09:00:00
    

    #after testing, replace 'StringIO(temp)' with 'filename.csv'
    df = pd.read_csv(StringIO(temp), parse_dates=['dates'], dayfirst=True)
    
    print (df)
       symb               dates
    0   BLK 2014-03-01 09:00:00
    1   BBR 2014-06-02 09:00:00
    2    HZ 2014-06-02 09:00:00
    3  OMNI 2014-07-02 09:00:00
    4  NOTE 2014-04-03 09:00:00
    5   AMP 2016-04-03 09:00:00
    6   RBY 2014-07-04 09:00:00
    
    print (df.dtypes)
    symb             object
    dates    datetime64[ns]
    dtype: object
    
    print (df.sort_values('dates'))
       symb               dates
    0   BLK 2014-03-01 09:00:00
    4  NOTE 2014-04-03 09:00:00
    1   BBR 2014-06-02 09:00:00
    2    HZ 2014-06-02 09:00:00
    3  OMNI 2014-07-02 09:00:00
    6   RBY 2014-07-04 09:00:00
    5   AMP 2016-04-03 09:00:00
    
  • 2020-12-21 18:32

    I am not sure how you are getting the data, but if you are importing it from a source such as a CSV file, you could use pandas.read_csv and set parse_dates (for example parse_dates=['dates']). The question is: what is the type of the dates column? You can easily convert the values to datetime objects using dateutil.parser.parse. For example:

    import pandas
    import dateutil.parser
    data = {'symb': ['BLK', 'BBR', 'HZ', 'OMNI', 'NOTE', 'AMP', 'RBY'],
            'dates': ['01/03/2014 09:00:00', '02/06/2014 09:00:00', '02/06/2014 09:00:00',
                   '02/07/2014 09:00:00', '03/04/2014 09:00:00', '03/04/2016 09:00:00',
                   '04/07/2014 09:00:00']}
    df = pandas.DataFrame.from_dict(data)
    df.dates = df.dates.apply(dateutil.parser.parse)
    print(df.to_string())
    
    # OUTPUT
    # 0 2014-01-03 09:00:00   BLK
    # 1 2014-02-06 09:00:00   BBR
    # 2 2014-02-06 09:00:00    HZ
    # 3 2014-02-07 09:00:00  OMNI
    # 4 2014-03-04 09:00:00  NOTE
    # 5 2016-03-04 09:00:00   AMP
    # 6 2014-04-07 09:00:00   RBY
    

    This gets you ISO 8601-style dates, which may be preferable to the dd/mm/yyyy format, but if you must have that format you can use the code recommended by @umutto.
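
Note that dateutil reads ambiguous dates month-first by default. A minimal sketch of the day-first variant follows, assuming the strings are dd/mm/yyyy; the dates_display column is a hypothetical name, added only to show the sorted values in the original layout.

```python
import pandas as pd
from dateutil import parser

df = pd.DataFrame({
    "symb": ["BBR", "BLK", "NOTE"],
    "dates": ["02/06/2014 09:00:00", "01/03/2014 09:00:00", "03/04/2014 09:00:00"],
})

# dayfirst=True tells dateutil to read "02/06/2014" as 2 June 2014.
parsed = df["dates"].apply(lambda s: parser.parse(s, dayfirst=True))

# pd.to_datetime guarantees a datetime64 column so the .dt accessor is available.
df["dates"] = pd.to_datetime(parsed)
df = df.sort_values("dates")

# Format the sorted timestamps back to dd/mm/yyyy text, for display only.
df["dates_display"] = df["dates"].dt.strftime("%d/%m/%Y %H:%M:%S")
print(df.to_string())
```

Sorting happens on the real timestamps, so the display column can use any layout without affecting the order.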
