问题
I have a pandas dataframe df
which has one column constituted by datetime64
, e.g.
<class 'pandas.core.frame.DataFrame'>
Int64Index: 1471 entries, 0 to 2940
Data columns (total 2 columns):
date 1471 non-null values
id 1471 non-null values
dtypes: datetime64[ns](1), int64(1)
I would like to sub-sample df
using as criterion the hour of the day (independently on the other information in date
). E.g., in pseudo code
df_sub = df[ (HOUR(df.date) > 8) & (HOUR(df.date) < 20) ]
for some function HOUR
.
I guess the problem can be solved via a preliminary conversion from datetime64
to datetime
. Can this be handled more efficiently?
回答1:
Found a simple solution.
df['hour'] = df.date.apply(lambda x : x.hour)
df_sub = df[(df.hour > 8) & (df.hour) <20]
EDIT:
There is a property dt
specifically introduced to handle this problem. The query becomes:
df_sub = df[ (df.date.dt.hour > 8)
& (df.date.dt.hour < 20) ]
来源:https://stackoverflow.com/questions/21624217/pandas-dataframe-with-a-datetime64-column-querying-by-hour