Converting date formats in pandas dataframe

感情败类 2021-01-24 14:07

I have a dataframe and the Date column has two different types of date formats going on.

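e.g. 1983-11-10 00:00:00 and 10/11/1983

I want them all to be the same type. How can I iterate through the Date column of my dataframe and convert the dates to one format?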

3 Answers
  • 2021-01-24 14:36

    Input data:

       NSECODE      Date      Close
    1   NSE500  20000103  1291.5500
    2   NSE500  20000104  1335.4500
    3   NSE500  20000105  1303.8000

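    # Cast to str and give the format explicitly: integer values would otherwise
    # be read as nanoseconds since the epoch rather than as YYYYMMDD.
    history_nseindex_df["Date"] = pd.to_datetime(history_nseindex_df["Date"].astype(str), format="%Y%m%d")
    history_nseindex_df["Date"] = history_nseindex_df["Date"].dt.strftime("%Y-%m-%d")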
    

    Output is now:

       NSECODE        Date      Close
    1   NSE500  2000-01-03  1291.5500
    2   NSE500  2000-01-04  1335.4500
    3   NSE500  2000-01-05  1303.8000
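
The answer does not say whether the Date column holds strings or integers. As a minimal sketch, assuming integers (as a CSV reader would typically produce), a bare pd.to_datetime call would treat each value as a count of nanoseconds since the Unix epoch. The frame name `prices` below is made up for illustration; casting to string and passing an explicit format makes either dtype parse as YYYYMMDD.

```python
import pandas as pd

# Hypothetical sample mirroring the sample rows above; Date is stored as integers here.
prices = pd.DataFrame({
    "NSECODE": ["NSE500", "NSE500", "NSE500"],
    "Date": [20000103, 20000104, 20000105],
    "Close": [1291.55, 1335.45, 1303.80],
})

# Cast to str and state the layout explicitly so the values are read as YYYYMMDD.
parsed = pd.to_datetime(prices["Date"].astype(str), format="%Y%m%d")

# Render every date as text in a single layout.
prices["Date"] = parsed.dt.strftime("%Y-%m-%d")
print(prices)
```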

  • 2021-01-24 14:40

    I want them all to be the same type, how can I iterate through the Date column of my dataframe and convert the dates to one format?

    Your input data is ambiguous: is 10 / 11 10th November or 11th October? You need to specify logic to determine which is appropriate. A function is useful if you wish to try multiple date formats sequentially:

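    def date_apply_formats(s, form_lst):
        # Parse with the first format; rows that do not match become NaT.
        out = pd.to_datetime(s, format=form_lst[0], errors='coerce')
        for form in form_lst[1:]:
            # Re-parse the original strings (not the partial result) and fill only the gaps.
            out = out.fillna(pd.to_datetime(s, format=form, errors='coerce'))
        return out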
    
    df['Date'] = date_apply_formats(df['Date'], ['%Y-%m-%d %H:%M:%S', '%d/%m/%Y'])
    

    Priority is given to the first item in form_lst. The solution is extendible to an arbitrary number of provided formats.
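
To illustrate what priority means for an ambiguous value, here is a small sketch. The helper name `parse_with_formats` and the sample strings are made up for this example. It tries the same two slash formats in both orders: only the ambiguous 10/11/1983 changes, because the other two strings are valid under just one format.

```python
import pandas as pd

def parse_with_formats(values, formats):
    """Try each format in order; earlier formats win when several match."""
    result = pd.to_datetime(values, format=formats[0], errors="coerce")
    for fmt in formats[1:]:
        # Always parse the original strings, and only fill rows still missing.
        result = result.fillna(pd.to_datetime(values, format=fmt, errors="coerce"))
    return result

# Hypothetical values: the first is ambiguous, the other two fit only one format each.
raw = pd.Series(["10/11/1983", "25/12/1983", "12/25/1983"])

day_first = parse_with_formats(raw, ["%d/%m/%Y", "%m/%d/%Y"])
month_first = parse_with_formats(raw, ["%m/%d/%Y", "%d/%m/%Y"])

print(day_first.dt.strftime("%Y-%m-%d").tolist())
print(month_first.dt.strftime("%Y-%m-%d").tolist())
```

Putting the format you trust most first is therefore a decision about the data, not something the code can infer.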

  • 2021-01-24 14:42

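    I believe you need the parameter dayfirst=True in to_datetime (on pandas 2.0 and later, a column that mixes formats like this one also needs format='mixed', otherwise the call raises a ValueError):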

    df = pd.DataFrame({'Date': {0: '1983-11-10 00:00:00', 1: '10/11/1983'}})
    print (df)
                      Date
    0  1983-11-10 00:00:00
    1           10/11/1983
    
    
    df['Date'] = pd.to_datetime(df.Date, dayfirst=True)
    print (df)
            Date
    0 1983-11-10
    1 1983-11-10
    

    because:

    df['Date'] = pd.to_datetime(df.Date)
    print (df)
            Date
    0 1983-11-10
    1 1983-10-11
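
The difference comes down to how a slash-separated string is read. As a minimal sketch, assuming a column where every value uses the slash layout (the sample values are made up and are valid either way round), the same strings give different dates with and without dayfirst:

```python
import pandas as pd

# Hypothetical sample: each value is a valid date under both readings.
slash_dates = pd.Series(["10/11/1983", "05/06/1984"])

month_first = pd.to_datetime(slash_dates)               # default reading: MM/DD/YYYY
day_first = pd.to_datetime(slash_dates, dayfirst=True)  # reading: DD/MM/YYYY

print(month_first.dt.strftime("%Y-%m-%d").tolist())
print(day_first.dt.strftime("%Y-%m-%d").tolist())
```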
    

    Or you can specify both formats and then use combine_first:

    d1 = pd.to_datetime(df.Date, format='%Y-%m-%d %H:%M:%S', errors='coerce')
    d2 = pd.to_datetime(df.Date, format='%d/%m/%Y', errors='coerce')
    
    df['Date'] = d1.combine_first(d2)
    print (df)
            Date
    0 1983-11-10
    1 1983-11-10
    

    General solution for multiple formats:

    from functools import reduce 
    
    def convert_formats_to_datetimes(col, formats):
        out = [pd.to_datetime(col, format=x, errors='coerce') for x in formats]
        return reduce(lambda l,r: pd.Series.combine_first(l,r), out)
    
    formats = ['%Y-%m-%d %H:%M:%S', '%d/%m/%Y']
    df['Date'] = df['Date'].pipe(convert_formats_to_datetimes, formats)
    print (df)
            Date
    0 1983-11-10
    1 1983-11-10
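
Since the question asks for one format throughout, a possible last step is to check for values that matched none of the formats and then write the dates back as text in a single layout. The sketch below assumes the two layouts from the question plus one deliberately unparseable value added for illustration. The choice of '%d/%m/%Y' as the output layout is arbitrary.

```python
import pandas as pd

# Hypothetical frame: the two layouts from the question plus one bad value.
df = pd.DataFrame({"Date": ["1983-11-10 00:00:00", "10/11/1983", "unknown"]})

iso = pd.to_datetime(df["Date"], format="%Y-%m-%d %H:%M:%S", errors="coerce")
slash = pd.to_datetime(df["Date"], format="%d/%m/%Y", errors="coerce")
parsed = iso.combine_first(slash)

# Surface anything that matched neither format instead of silently keeping NaT.
unparsed = df.loc[parsed.isna(), "Date"].tolist()
print("Could not parse:", unparsed)

# Write the dates back as text in one layout; unparsed rows become missing values.
df["Date"] = parsed.dt.strftime("%d/%m/%Y")
print(df)
```

Keeping the column as datetime64 is usually more useful for further computation. Converting to strings only makes sense when a specific text layout is what has to be stored or exported.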
    