Convert string date to a different format in pandas dataframe

忘掉有多难 2021-01-21 16:51

I have searched the community for an answer to this, but so far I have not found one.

I have a DataFrame in Python 3.5.1 that contains a column of dates stored as strings, imported from

2 Answers
  • 2021-01-21 17:14

    For most common date and datetime formats, the pandas to_datetime function can parse the strings without an explicit format. For example:

    df.TimeStamp.apply(lambda x: pd.to_datetime(x))

    For the example given in the question,

    df['TimeStamp'] = pd.to_datetime(df['TimeStamp']).dt.strftime('%m/%d/%Y %H:%M:%S')

    produces the desired output without specifying a parsing format.

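    Using .apply is convenient if you have multiple columns to convert, for example df[cols].apply(pd.to_datetime), where cols is a list of column names. For a single column, calling pd.to_datetime on the whole Series is usually faster than an element-wise apply.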
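
As a minimal sketch of the multi-column case, assuming a DataFrame with two string date columns (the column names "start" and "end" and the sample values are made up for illustration and do not come from the question):

```python
import pandas as pd

# Hypothetical data: two columns of date strings in the question's format.
df = pd.DataFrame({
    "start": ["2016/06/08 17:19:53", "2016/06/08 17:19:54"],
    "end": ["2016/06/09 08:00:00", "2016/06/10 09:30:00"],
})
date_cols = ["start", "end"]

# DataFrame.apply calls pd.to_datetime once per column (not once per cell).
converted = df[date_cols].apply(pd.to_datetime)

# Format every converted column back to strings in the target layout.
formatted = converted.apply(lambda col: col.dt.strftime("%m/%d/%Y %H:%M:%S"))
print(formatted)
```

Here each column is still parsed as a whole Series, so the per-element overhead of a lambda over individual strings is avoided.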

    Of course, providing an explicit parsing format is still necessary in many situations, for example when the order of day and month is ambiguous. For the full list of format codes, see https://docs.python.org/3/library/datetime.html.
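
A small sketch of passing an explicit format, assuming day-first input strings (the sample values are invented for illustration; the question's own data is year-first and does not need this):

```python
import pandas as pd

# Hypothetical day-first strings: "08/06/2016" is meant as 8 June 2016.
raw = pd.Series(["08/06/2016 17:19:53", "09/06/2016 17:19:54"])

# The format string removes the day/month ambiguity.
parsed = pd.to_datetime(raw, format="%d/%m/%Y %H:%M:%S")

# Re-emit in an unambiguous ISO-like layout.
formatted = parsed.dt.strftime("%Y-%m-%d %H:%M:%S")
print(formatted)
```

Supplying the format also lets pandas skip format inference, which can help on large columns.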

  • 2021-01-21 17:27

    If you convert the column of strings to a time series, you could use the dt.strftime method:

    import numpy as np
    import pandas as pd
    nan = np.nan
    df = pd.DataFrame({'TBD': [nan, nan, nan], 'TBD.1': [nan, nan, nan], 'TBD.2': [nan, nan, nan], 'TimeStamp': ['2016/06/08 17:19:53', '2016/06/08 17:19:54', '2016/06/08 17:19:54'], 'Value': [0.062941999999999998, 0.062941999999999998, 0.062941999999999998]})
    df['TimeStamp'] = pd.to_datetime(df['TimeStamp']).dt.strftime('%m/%d/%Y %H:%M:%S')
    print(df)
    

    yields

       TBD  TBD.1  TBD.2            TimeStamp     Value
    0  NaN    NaN    NaN  06/08/2016 17:19:53  0.062942
    1  NaN    NaN    NaN  06/08/2016 17:19:54  0.062942
    2  NaN    NaN    NaN  06/08/2016 17:19:54  0.062942
    

    Since you want to convert a column of strings to another (different) column of strings, you could also use the vectorized str.replace method:

    import numpy as np
    import pandas as pd
    nan = np.nan
    df = pd.DataFrame({'TBD': [nan, nan, nan], 'TBD.1': [nan, nan, nan], 'TBD.2': [nan, nan, nan], 'TimeStamp': ['2016/06/08 17:19:53', '2016/06/08 17:19:54', '2016/06/08 17:19:54'], 'Value': [0.062941999999999998, 0.062941999999999998, 0.062941999999999998]})
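    # regex=True is needed on pandas >= 2.0, where str.replace defaults to literal matching
    df['TimeStamp'] = df['TimeStamp'].str.replace(r'(\d+)/(\d+)/(\d+)(.*)', r'\2/\3/\1\4', regex=True)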
    print(df)
    

    since

    In [32]: df['TimeStamp'].str.replace(r'(\d+)/(\d+)/(\d+)(.*)', r'\2/\3/\1\4', regex=True)
    Out[32]: 
    0    06/08/2016 17:19:53
    1    06/08/2016 17:19:54
    2    06/08/2016 17:19:54
    Name: TimeStamp, dtype: object
    

    This uses regex to rearrange pieces of the string without first parsing the string as a date. This is faster than the first method (mainly because it skips the parsing step), but it also has the disadvantage of not checking that the date strings are valid dates.
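
To illustrate the validity trade-off, here is a rough sketch that feeds one deliberately invalid date (month 13, day 45, invented for this example) through both approaches. Using errors="coerce" is one possible way to surface bad values, not something the original answer prescribes:

```python
import pandas as pd

# Second entry is intentionally not a real date.
raw = pd.Series(["2016/06/08 17:19:53", "2016/13/45 17:19:54"])

# Regex approach: just reorders the pieces, so the bad value passes through.
swapped = raw.str.replace(r"(\d+)/(\d+)/(\d+)(.*)", r"\2/\3/\1\4", regex=True)

# Parsing approach: errors="coerce" turns unparseable strings into NaT.
parsed = pd.to_datetime(raw, format="%Y/%m/%d %H:%M:%S", errors="coerce")

print(swapped.tolist())
print(parsed.isna().tolist())
```

If the input is already known to be clean, the regex version is fine; otherwise parsing first gives a chance to catch malformed rows before reformatting.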
