I was wondering if pandas is capable of automatically detecting which columns contain datetimes and reading those columns in as dates instead of strings?
I am looki
I would use pd.to_datetime and catch exceptions on columns that don't work. For example:
import pandas as pd

df = pd.read_csv('test.csv')

# Try to convert every string (object) column; leave it alone if parsing fails.
for col in df.columns:
    if df[col].dtype == 'object':
        try:
            df[col] = pd.to_datetime(df[col])
        except ValueError:
            pass

df.dtypes
# (object, datetime64[ns], int64)
I believe this is as close to "automatic" as you can get for this application.
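To try the approach without a test.csv on disk, here is a minimal sketch that wraps the same loop in a function and runs it on an in-memory CSV. The function name convert_datetime_columns, the sample data, and the column names are all made up for illustration; the dtype check uses pandas.api.types so that it should also cover pandas versions that read text columns with a dedicated string dtype rather than object.

```python
import io

import pandas as pd

# Hypothetical sample data standing in for test.csv.
CSV_TEXT = """name,joined,score
alice,2021-01-05,10
bob,2021-02-17,12
"""


def convert_datetime_columns(frame):
    """Return a copy in which every fully parseable text column is a datetime column."""
    result = frame.copy()
    for col in result.columns:
        series = result[col]
        # Only text-like columns are candidates; numeric columns are skipped.
        is_text = (pd.api.types.is_object_dtype(series)
                   or pd.api.types.is_string_dtype(series))
        if not is_text:
            continue
        try:
            result[col] = pd.to_datetime(series)
        except (ValueError, TypeError):
            # At least one value is not a date, so keep the column unchanged.
            pass
    return result


converted = convert_datetime_columns(pd.read_csv(io.StringIO(CSV_TEXT)))
print(converted.dtypes)
```

Returning a copy keeps the original frame untouched, which makes it easy to compare dtypes before and after. pandas may print a warning about format inference while it tries the non-date column; that is expected here.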
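You can avoid an explicit for loop by passing errors='ignore', which leaves columns that cannot be parsed unmodified. The documentation describes it as: "If 'ignore', then invalid parsing will return the input". Note that errors='ignore' is deprecated as of pandas 2.2, so this applies to older versions. In the code below we apply a to_datetime transformation (ignoring errors) to all object columns; other columns are returned as is.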
df = df.apply(lambda col: pd.to_datetime(col, errors='ignore')
              if col.dtypes == object
              else col,
              axis=0)
df.dtypes
# 0 object
# 1 datetime64[ns]
# 2 int64
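If errors='ignore' is not available or raises a deprecation warning in your pandas version, the same apply pattern can be kept by moving the exception handling into a small helper. This is only a sketch: try_datetime is a hypothetical name, and the sample data is invented for the example.

```python
import io

import pandas as pd


def try_datetime(col):
    """Hypothetical helper: parse a text column as datetimes, else return it unchanged."""
    is_text = (pd.api.types.is_object_dtype(col)
               or pd.api.types.is_string_dtype(col))
    if not is_text:
        return col
    try:
        return pd.to_datetime(col)
    except (ValueError, TypeError):
        # Parsing failed for at least one value, so return the input as is.
        return col


# Invented sample data with one text, one datetime-like, and one integer column.
sample = pd.read_csv(io.StringIO(
    "label,when,count\n"
    "x,2020-03-01 12:00,1\n"
    "y,2020-03-02 08:30,2\n"
))
sample = sample.apply(try_datetime, axis=0)
print(sample.dtypes)
```

The helper behaves like the documented 'ignore' mode at column level: a column is either converted as a whole or returned untouched.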