Pandas - automatically detect date columns **at run time**

清酒与你 2021-01-04 08:19

I was wondering whether pandas can automatically detect which columns hold datetime values and read those columns in as dates instead of strings.

I am looki

2 Answers
  • 2021-01-04 09:13

    I would use pd.to_datetime, and catch exceptions on columns that don't work. For example:

    import pandas as pd
    
    df = pd.read_csv('test.csv')
    
    for col in df.columns:
        if df[col].dtype == 'object':
            try:
                df[col] = pd.to_datetime(df[col])
            except ValueError:
                pass
    
    df.dtypes
    # (object, datetime64[ns], int64)
    

    I believe this is as close to "automatic" as you can get for this application.
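
The loop above can be wrapped in a small reusable function. This is a minimal sketch, not the answerer's code: the helper name `detect_date_columns` and the in-memory sample data (standing in for `test.csv`) are assumptions made for illustration, and it also catches `TypeError` as a precaution.

```python
import io

import pandas as pd

# Hypothetical sample data standing in for test.csv.
CSV_TEXT = """name,joined,score
alice,2021-01-04,10
bob,2021-02-15,20
"""


def detect_date_columns(df):
    """Return a copy of df in which string columns that parse cleanly become datetimes."""
    out = df.copy()
    for col in out.columns:
        is_text = (pd.api.types.is_object_dtype(out[col])
                   or pd.api.types.is_string_dtype(out[col]))
        if not is_text:
            continue  # numeric and other typed columns are left alone
        try:
            out[col] = pd.to_datetime(out[col])
        except (ValueError, TypeError):
            pass  # at least one value did not parse: keep the column as text
    return out


df = detect_date_columns(pd.read_csv(io.StringIO(CSV_TEXT)))
print(df.dtypes)
```

Because a column is converted only when every value parses, a single malformed entry keeps the whole column as text.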

  • 2021-01-04 09:13

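    You can avoid an explicit for loop by passing errors='ignore', which leaves values that cannot be parsed untouched. The code below applies to_datetime (ignoring errors) to every object column; all other columns are returned as is. Note that errors='ignore' is deprecated as of pandas 2.2, so on newer versions catch the exception explicitly instead.

    From the to_datetime documentation: "If 'ignore', then invalid parsing will return the input."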

    df = df.apply(lambda col: pd.to_datetime(col, errors='ignore') 
                  if col.dtypes == object 
                  else col, 
                  axis=0)
    
    df.dtypes
    
    # 0            object
    # 1    datetime64[ns]
    # 2             int64
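
Since `errors='ignore'` is deprecated in recent pandas releases, the same `apply` pattern can be written with an explicit `try`/`except`. The following is a sketch under that assumption; the helper name `to_datetime_or_keep` and the sample frame are hypothetical and not part of the original answer.

```python
import pandas as pd


def to_datetime_or_keep(col):
    """Parse a text column as datetimes; return it unchanged if parsing fails."""
    is_text = (pd.api.types.is_object_dtype(col)
               or pd.api.types.is_string_dtype(col))
    if not is_text:
        return col  # non-text columns are returned as is
    try:
        return pd.to_datetime(col)
    except (ValueError, TypeError):
        return col  # not a date column: keep the original values


# Hypothetical frame with one text, one date-like, and one integer column.
df = pd.DataFrame({
    0: ["a", "b", "c"],
    1: ["2021-01-04", "2021-01-05", "2021-01-06"],
    2: [1, 2, 3],
})

df = df.apply(to_datetime_or_keep, axis=0)
print(df.dtypes)
```

This keeps the loop-free style of the answer while avoiding the deprecated option.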
    