Converting date formats in pandas dataframe

感情败类 2021-01-24 14:07

I have a DataFrame whose Date column contains two different date formats,

e.g. 1983-11-10 00:00:00 and 10/11/1983.

I want them all to b

3 answers
  •  小鲜肉 (original poster)
    2021-01-24 14:42

    I believe you need the parameter dayfirst=True in to_datetime:

    import pandas as pd

    df = pd.DataFrame({'Date': {0: '1983-11-10 00:00:00', 1: '10/11/1983'}})
    print(df)
                      Date
    0  1983-11-10 00:00:00
    1           10/11/1983
    
    
    # dayfirst=True reads the ambiguous 10/11/1983 as 10 November, not 11 October
    df['Date'] = pd.to_datetime(df['Date'], dayfirst=True)
    print(df)
            Date
    0 1983-11-10
    1 1983-11-10
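
A note on newer pandas, offered as a sketch rather than as part of the answer above: from pandas 2.0 on, to_datetime infers a single format from the first value and applies it to the whole column, so a column that mixes formats may need format='mixed' together with dayfirst=True. The helper name parse_dayfirst and the version check are assumptions made for illustration.

```python
import pandas as pd

def parse_dayfirst(col):
    """Parse mixed-format date strings, preferring the day-first reading."""
    major = int(pd.__version__.split('.')[0])
    if major >= 2:
        # pandas 2.0+: infer the format of each element individually
        return pd.to_datetime(col, format='mixed', dayfirst=True)
    # older pandas falls back to per-element parsing on its own
    return pd.to_datetime(col, dayfirst=True)

df = pd.DataFrame({'Date': ['1983-11-10 00:00:00', '10/11/1983']})
df['Date'] = parse_dayfirst(df['Date'])
print(df)
```

dayfirst is a preference rather than a strict rule, so it is worth spot-checking a few converted rows against the raw strings.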
    

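    because the default is month-first, so without dayfirst the second value is read as 11 October (starting again from the original strings):

    df['Date'] = pd.to_datetime(df['Date'])
    print(df)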
            Date
    0 1983-11-10
    1 1983-10-11
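
The two results differ only in how the ambiguous string 10/11/1983 is read. A small standard-library sketch (not pandas itself, just the two candidate format strings) makes both readings explicit:

```python
from datetime import datetime

ambiguous = '10/11/1983'

# day-first reading: 10 November 1983
day_first = datetime.strptime(ambiguous, '%d/%m/%Y')
# month-first reading: 11 October 1983
month_first = datetime.strptime(ambiguous, '%m/%d/%Y')

print(day_first.date())    # 1983-11-10
print(month_first.date())  # 1983-10-11
```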
    

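    Or you can parse the column once per format and then merge the results with combine_first:

    # errors='coerce' turns values that do not match the format into NaT
    d1 = pd.to_datetime(df['Date'], format='%Y-%m-%d %H:%M:%S', errors='coerce')
    d2 = pd.to_datetime(df['Date'], format='%d/%m/%Y', errors='coerce')

    # keep d1 where it parsed, fill the remaining NaT values from d2
    df['Date'] = d1.combine_first(d2)
    print(df)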
            Date
    0 1983-11-10
    1 1983-11-10
    

    General solution for multiple formats:

    from functools import reduce

    def convert_formats_to_datetimes(col, formats):
        # parse the column once per format; non-matching values become NaT
        out = [pd.to_datetime(col, format=x, errors='coerce') for x in formats]
        # earlier formats take priority; later ones only fill remaining NaT
        return reduce(lambda l, r: l.combine_first(r), out)

    formats = ['%Y-%m-%d %H:%M:%S', '%d/%m/%Y']
    df['Date'] = df['Date'].pipe(convert_formats_to_datetimes, formats)
    print(df)
            Date
    0 1983-11-10
    1 1983-11-10
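
One thing to watch with errors='coerce' is that a value matching none of the formats silently becomes NaT. Below is a minimal sketch for listing such values, assuming the raw strings are still available as a separate Series; the sample data, the 'not a date' entry, and the name parse_with_formats are made up for illustration.

```python
from functools import reduce

import pandas as pd

def parse_with_formats(col, formats):
    # same idea as above: one parse per format, merged with combine_first
    parsed = [pd.to_datetime(col, format=f, errors='coerce') for f in formats]
    return reduce(lambda l, r: l.combine_first(r), parsed)

raw = pd.Series(['1983-11-10 00:00:00', '10/11/1983', 'not a date'])
formats = ['%Y-%m-%d %H:%M:%S', '%d/%m/%Y']
parsed = parse_with_formats(raw, formats)

# rows that were present in the input but matched none of the formats
unparsed = raw[parsed.isna() & raw.notna()]
print(unparsed)
```

If unparsed is not empty, another format can be appended to the list, or the offending rows can be inspected by hand.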
    
