symb dates
4 BLK 01/03/2014 09:00:00
0 BBR 02/06/2014 09:00:00
21 HZ 02/06/2014 09:00:00
24 OMNI 02/07/2014 09:00:00
31 NOTE
You can parse the column with pandas.to_datetime, passing the format argument, and then sort on it.
>> df['date'] = pd.to_datetime(df['date'], format='%m/%d/%Y %H:%M:%S')
>> df.sort_values('date')
date symb
0 2014-01-03 09:00:00 BLK
1 2014-02-06 09:00:00 BBR
2 2014-02-06 09:00:00 HZ
3 2014-02-07 09:00:00 OMNI
4 2014-03-04 09:00:00 NOTE
6 2014-04-07 09:00:00 RBY
5 2016-03-04 09:00:00 AMP
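For anyone who wants to run this end to end, here is a minimal self-contained sketch of the same steps. The small frame is an assumption built from a few of the question's sample rows, and the column name date follows this answer. It uses sort_values, since DataFrame.sort is no longer available in recent pandas releases.

```python
import pandas as pd

# Hypothetical sample built from a few rows of the question's data.
df = pd.DataFrame({
    "symb": ["NOTE", "BLK", "AMP", "BBR"],
    "date": ["03/04/2014 09:00:00", "01/03/2014 09:00:00",
             "03/04/2016 09:00:00", "02/06/2014 09:00:00"],
})

# Parse with an explicit month/day/year format so nothing is guessed.
df["date"] = pd.to_datetime(df["date"], format="%m/%d/%Y %H:%M:%S")

# sort_values returns a new frame; df itself keeps its original row order.
sorted_df = df.sort_values("date")
print(sorted_df)
```

Sorting works chronologically here because the column holds real datetimes rather than strings, so the 2016 row lands last even though its string starts with "03".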
You can use to_datetime to convert the column, then sort_values to sort it:
#format mm/dd/YYYY
df['dates'] = pd.to_datetime(df['dates'])
print (df.sort_values('dates'))
symb dates
4 BLK 2014-01-03 09:00:00
0 BBR 2014-02-06 09:00:00
21 HZ 2014-02-06 09:00:00
24 OMNI 2014-02-07 09:00:00
31 NOTE 2014-03-04 09:00:00
40 RBY 2014-04-07 09:00:00
65 AMP 2016-03-04 09:00:00
#format dd/mm/YYYY
df['dates'] = pd.to_datetime(df['dates'], dayfirst=True)
print (df.sort_values('dates'))
symb dates
4 BLK 2014-03-01 09:00:00
31 NOTE 2014-04-03 09:00:00
0 BBR 2014-06-02 09:00:00
21 HZ 2014-06-02 09:00:00
24 OMNI 2014-07-02 09:00:00
40 RBY 2014-07-04 09:00:00
65 AMP 2016-04-03 09:00:00
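The two outputs above differ because strings such as 01/03/2014 are ambiguous. As a small sketch of that point (the two-element Series is made up for illustration), spelling out the format explicitly is another way to pin down which reading you want instead of relying on inference plus dayfirst:

```python
import pandas as pd

# Two ambiguous strings taken from the question's data.
raw = pd.Series(["01/03/2014 09:00:00", "03/04/2014 09:00:00"])

# Month-first reading: January 3rd and March 4th.
month_first = pd.to_datetime(raw, format="%m/%d/%Y %H:%M:%S")

# Day-first reading: March 1st and April 3rd.
day_first = pd.to_datetime(raw, format="%d/%m/%Y %H:%M:%S")

print(month_first.dt.strftime("%Y-%m-%d").tolist())  # ['2014-01-03', '2014-03-04']
print(day_first.dt.strftime("%Y-%m-%d").tolist())    # ['2014-03-01', '2014-04-03']
```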
Another solution is to use the parse_dates parameter of read_csv; if the format is dd/mm/YYYY, add dayfirst=True:
import pandas as pd
import numpy as np
from io import StringIO
temp=u"""symb,dates
BLK,01/03/2014 09:00:00
BBR,02/06/2014 09:00:00
HZ,02/06/2014 09:00:00
OMNI,02/07/2014 09:00:00
NOTE,03/04/2014 09:00:00
AMP,03/04/2016 09:00:00
RBY,04/07/2014 09:00:00"""
#after testing replace 'StringIO(temp)' with 'filename.csv'
df = pd.read_csv(StringIO(temp), parse_dates=['dates'])
print (df)
symb dates
0 BLK 2014-01-03 09:00:00
1 BBR 2014-02-06 09:00:00
2 HZ 2014-02-06 09:00:00
3 OMNI 2014-02-07 09:00:00
4 NOTE 2014-03-04 09:00:00
5 AMP 2016-03-04 09:00:00
6 RBY 2014-04-07 09:00:00
print (df.dtypes)
symb object
dates datetime64[ns]
dtype: object
print (df.sort_values('dates'))
symb dates
0 BLK 2014-01-03 09:00:00
1 BBR 2014-02-06 09:00:00
2 HZ 2014-02-06 09:00:00
3 OMNI 2014-02-07 09:00:00
4 NOTE 2014-03-04 09:00:00
6 RBY 2014-04-07 09:00:00
5 AMP 2016-03-04 09:00:00
#after testing replace 'StringIO(temp)' with 'filename.csv'
df = pd.read_csv(StringIO(temp), parse_dates=['dates'], dayfirst=True)
print (df)
symb dates
0 BLK 2014-03-01 09:00:00
1 BBR 2014-06-02 09:00:00
2 HZ 2014-06-02 09:00:00
3 OMNI 2014-07-02 09:00:00
4 NOTE 2014-04-03 09:00:00
5 AMP 2016-04-03 09:00:00
6 RBY 2014-07-04 09:00:00
print (df.dtypes)
symb object
dates datetime64[ns]
dtype: object
print (df.sort_values('dates'))
symb dates
0 BLK 2014-03-01 09:00:00
4 NOTE 2014-04-03 09:00:00
1 BBR 2014-06-02 09:00:00
2 HZ 2014-06-02 09:00:00
3 OMNI 2014-07-02 09:00:00
6 RBY 2014-07-04 09:00:00
5 AMP 2016-04-03 09:00:00
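If you would rather not depend on how read_csv infers the date layout, one possible variation (a sketch, not part of the approach above) is to read the column as plain strings and convert it afterwards with an explicit format. The three-row csv_text sample and the day-first format are assumptions for illustration:

```python
from io import StringIO

import pandas as pd

# Hypothetical in-memory CSV; swap StringIO(csv_text) for a real file path.
csv_text = """symb,dates
NOTE,03/04/2014 09:00:00
BLK,01/03/2014 09:00:00
RBY,04/07/2014 09:00:00"""

# Without parse_dates the column is read as plain strings.
df = pd.read_csv(StringIO(csv_text))

# Convert explicitly, here assuming the dd/mm/YYYY layout.
df["dates"] = pd.to_datetime(df["dates"], format="%d/%m/%Y %H:%M:%S")

df = df.sort_values("dates")
print(df)
```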
I am not sure how you are getting the data, but if you are importing it from some source such as a CSV, you could use pandas.read_csv and pass the column through parse_dates (for example parse_dates=['dates']; parse_dates=True on its own only tries to parse the index). The question is: what is the type of the dates column? You can easily change the values to date-like objects using dateutil.parser.parse. For example,
import pandas
import dateutil.parser  # import the parser submodule explicitly
data = {'symb': ['BLK', 'BBR', 'HZ', 'OMNI', 'NOTE', 'AMP', 'RBY'],
        'dates': ['01/03/2014 09:00:00', '02/06/2014 09:00:00', '02/06/2014 09:00:00',
                  '02/07/2014 09:00:00', '03/04/2014 09:00:00', '03/04/2016 09:00:00',
                  '04/07/2014 09:00:00']}
df = pandas.DataFrame.from_dict(data)
# parse each string into a datetime (ambiguous dates are read month-first by default)
df.dates = df.dates.apply(dateutil.parser.parse)
print(df.to_string())
# OUTPUT
# 0 2014-01-03 09:00:00 BLK
# 1 2014-02-06 09:00:00 BBR
# 2 2014-02-06 09:00:00 HZ
# 3 2014-02-07 09:00:00 OMNI
# 4 2014-03-04 09:00:00 NOTE
# 5 2016-03-04 09:00:00 AMP
# 6 2014-04-07 09:00:00 RBY
This gets you the ISO 8601 format, which may be preferable to the dd/mm/yyyy
format, but if you must have that format you can use the code recommended by @umutto.
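As a rough sketch of one way to get the dd/mm/yyyy layout back for display (this is not necessarily the code referred to above; the three-row frame is made up for illustration), you can sort while the column is still a datetime and only then format it as strings with dt.strftime:

```python
import pandas as pd

# Hypothetical frame whose dates column is already datetime64.
df = pd.DataFrame({
    "symb": ["AMP", "BLK", "NOTE"],
    "dates": pd.to_datetime(["2016-03-04 09:00:00",
                             "2014-01-03 09:00:00",
                             "2014-03-04 09:00:00"]),
})

# Sort first, while the values are real datetimes.
out = df.sort_values("dates").copy()

# Then turn them into dd/mm/YYYY strings purely for presentation.
out["dates"] = out["dates"].dt.strftime("%d/%m/%Y %H:%M:%S")
print(out.to_string(index=False))
```

Doing the formatting last matters: once the column is strings again, sorting it would compare text rather than dates.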