I have a pandas DataFrame in which one of the columns contains date strings in the format YYYY-MM-DD,
e.g. '2013-10-28'.
At the moment th
Essentially equivalent to @waitingkuo, but I would use to_datetime here (it seems a little cleaner, and offers some additional functionality, e.g. dayfirst):
In [11]: df
Out[11]:
a time
0 1 2013-01-01
1 2 2013-01-02
2 3 2013-01-03
In [12]: pd.to_datetime(df['time'])
Out[12]:
0 2013-01-01 00:00:00
1 2013-01-02 00:00:00
2 2013-01-03 00:00:00
Name: time, dtype: datetime64[ns]
In [13]: df['time'] = pd.to_datetime(df['time'])
In [14]: df
Out[14]:
a time
0 1 2013-01-01 00:00:00
1 2 2013-01-02 00:00:00
2 3 2013-01-03 00:00:00
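As a rough sketch of the extra functionality mentioned above, the snippet below rebuilds the same small frame and shows two optional arguments of to_datetime: format, which states the layout explicitly, and dayfirst, which resolves ambiguous day/month strings. The sample string '10/11/2013' and the variable name ts are made up for illustration.

```python
import pandas as pd

# Rebuild the example frame from the session above.
df = pd.DataFrame({
    "a": [1, 2, 3],
    "time": ["2013-01-01", "2013-01-02", "2013-01-03"],
})

# Stating the format explicitly skips format inference and
# rejects strings that do not match YYYY-MM-DD.
df["time"] = pd.to_datetime(df["time"], format="%Y-%m-%d")

# dayfirst=True reads an ambiguous string as day/month/year,
# so this (hypothetical) value is 10 November, not 11 October.
ts = pd.to_datetime("10/11/2013", dayfirst=True)

print(df.dtypes)
print(ts)
```

Passing format is usually worth it when you already know the layout, as in the question, since a non-matching string then fails loudly instead of being guessed at.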
Handling ValueErrors
If you run into a situation where doing
df['time'] = pd.to_datetime(df['time'])
throws a
ValueError: Unknown string format
That means you have invalid (non-coercible) values. If you are okay with having them converted to pd.NaT, you can add an errors='coerce' argument to to_datetime:
df['time'] = pd.to_datetime(df['time'], errors='coerce')
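To see what coercion does in practice, here is a minimal sketch on a made-up Series containing one bad value ('not a date' is an assumption for illustration). It also shows one way you might list the offending inputs before deciding whether silently turning them into NaT is acceptable.

```python
import pandas as pd

# Hypothetical input with one value that cannot be parsed.
s = pd.Series(["2013-01-01", "not a date", "2013-01-03"])

# Unparseable entries become NaT instead of raising ValueError.
converted = pd.to_datetime(s, errors="coerce")

# Inspect which original strings were coerced to NaT.
bad_rows = s[converted.isna()]

print(converted)
print(bad_rows)
```

Note that this check also flags values that were already missing in the input, so it is only a starting point for cleaning.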
Try converting one of the rows to a timestamp using the pd.to_datetime function, and then use .map to apply the function to the entire column.
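A minimal sketch of that approach, assuming a column named 'time' as in the earlier answer: first check a single value, then map the function over the column. Mapping calls to_datetime once per element, so on large columns it will likely be slower than passing the whole column to pd.to_datetime.

```python
import pandas as pd

df = pd.DataFrame({"time": ["2013-01-01", "2013-01-02", "2013-01-03"]})

# Step 1: sanity-check the conversion on a single value.
first = pd.to_datetime(df["time"].iloc[0])
print(first)

# Step 2: apply the same function to every element of the column.
df["time"] = df["time"].map(pd.to_datetime)
print(df["time"])
```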
Now you can do df['column'].dt.date
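For illustration, the sketch below assumes a frame whose 'column' already holds datetimes. The .dt accessor only works on datetime-like columns, so the conversion has to come first. The names dates and midnight are hypothetical.

```python
import datetime

import pandas as pd

df = pd.DataFrame({"column": pd.to_datetime(["2013-01-01 08:30", "2013-01-02 17:45"])})

# .dt.date yields plain Python datetime.date objects (time dropped).
dates = df["column"].dt.date

# .dt.normalize() keeps the datetime dtype but resets the time to midnight.
midnight = df["column"].dt.normalize()

print(dates)
print(midnight)
```

If you want to keep doing datetime arithmetic on the column afterwards, normalize() is probably the more convenient of the two, since the result is still a datetime column.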
Note that if you don't see the time component when every value is 00:00:00, the values are still full datetimes. It is only the display (e.g. in an IPython notebook) trimming it to make things look pretty.
Another way to do this, which works well if you have multiple columns to convert to datetime:
cols = ['date1','date2']
df[cols] = df[cols].apply(pd.to_datetime)
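A self-contained sketch of the multi-column version, on a made-up frame (the column 'a' and the value 'oops' are assumptions for illustration). DataFrame.apply forwards extra keyword arguments to the function, so errors='coerce' can be combined with this approach.

```python
import pandas as pd

df = pd.DataFrame({
    "a": [1, 2],
    "date1": ["2013-01-01", "2013-01-02"],
    "date2": ["2013-02-01", "oops"],  # hypothetical bad value
})

cols = ["date1", "date2"]

# apply() calls pd.to_datetime once per column and passes
# errors="coerce" through to it; other columns are untouched.
df[cols] = df[cols].apply(pd.to_datetime, errors="coerce")

print(df.dtypes)
print(df)
```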