I have a pandas.DataFrame
called df
which has an automatically generated index, with a column dt
:
df[\'dt\'].dtype, df
In pandas 0.18.0 and later, there are datetime floor, ceil and round methods to round timestamps to a given fixed precision/frequency. To round down to hour precision, you can use:
>>> df['dt2'] = df['dt'].dt.floor('h')
>>> df
dt dt2
0 2014-10-01 10:02:45 2014-10-01 10:00:00
1 2014-10-01 13:08:17 2014-10-01 13:00:00
2 2014-10-01 17:39:24 2014-10-01 17:00:00
Here's another alternative to truncate the timestamps. Unlike floor
, it supports truncating to a precision such as year or month.
You can temporarily adjust the precision unit of the underlying NumPy datetime64
datatype, changing it from [ns]
to [h]
:
df['dt'].values.astype('<M8[h]')
This truncates everything to hour precision. For example:
>>> df
dt
0 2014-10-01 10:02:45
1 2014-10-01 13:08:17
2 2014-10-01 17:39:24
>>> df['dt2'] = df['dt'].values.astype('<M8[h]')
>>> df
dt dt2
0 2014-10-01 10:02:45 2014-10-01 10:00:00
1 2014-10-01 13:08:17 2014-10-01 13:00:00
2 2014-10-01 17:39:24 2014-10-01 17:00:00
>>> df.dtypes
dt datetime64[ns]
dt2 datetime64[ns]
The same method should work for any other unit: months 'M'
, minutes 'm'
, and so on:
'<M8[Y]'
'<M8[M]'
'<M8[D]'
'<M8[m]'
'<M8[s]'
A method I've used in the past to accomplish this goal was the following (quite similar to what you're already doing, but thought I'd throw it out there anyway):
df['dt2'] = df['dt'].apply(lambda x: x.replace(minute=0, second=0))