How do I convert dates in a Pandas data frame to a 'date' data type?

独厮守ぢ 2020-11-28 02:56

I have a Pandas data frame in which one of the columns contains date strings in the format YYYY-MM-DD.

For example, '2013-10-28'.

At the moment th

10 answers
  • 2020-11-28 03:22

    I imagine a lot of data comes into Pandas from CSV files, in which case you can simply convert the date during the initial CSV read:

    dfcsv = pd.read_csv('xyz.csv', parse_dates=[0])

    Here 0 is the position of the column that holds the dates. You could also add index_col=0 if you want the date to be your index.

    See https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.read_csv.html
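
    As a minimal sketch of the read-time conversion described above, the snippet below reads a small in-memory CSV instead of a file. The CSV contents and the column names `date` and `value` are made up for illustration; only `parse_dates` and `index_col` come from the answer.

```python
import io

import pandas as pd

# Hypothetical CSV contents standing in for 'xyz.csv'
csv_text = "date,value\n2013-10-28,1\n2013-10-29,2\n"

# parse_dates=[0] parses the first column as dates;
# index_col=0 additionally uses that column as the index
dfcsv = pd.read_csv(io.StringIO(csv_text), parse_dates=[0], index_col=0)

print(dfcsv.index.dtype)
print(dfcsv)
```

    If the dates are not needed as the index, dropping `index_col=0` should leave them as an ordinary datetime column.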

  • 2020-11-28 03:30

    For completeness, here is another option. It may not be the most straightforward one; it is similar to the one proposed by @SSS but uses the datetime library instead:

    import datetime
    # '%Y-%m-%d' matches strings in YYYY-MM-DD order, such as '2013-10-28'
    df["Date"] = df["Date"].apply(lambda x: datetime.datetime.strptime(x, '%Y-%m-%d').date())
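
    A small self-contained sketch of this approach, assuming the strings are in YYYY-MM-DD order as in the question (so the matching format string is '%Y-%m-%d'). The sample values are made up. Note that the result holds plain Python `datetime.date` objects, so the column dtype stays `object` rather than becoming `datetime64`.

```python
import datetime

import pandas as pd

# Made-up sample data in YYYY-MM-DD format
df = pd.DataFrame({"Date": ["2013-10-28", "2013-10-29"]})

# Parse each string with strptime and keep only the date part
df["Date"] = df["Date"].apply(
    lambda x: datetime.datetime.strptime(x, "%Y-%m-%d").date()
)

print(type(df["Date"].iloc[0]))
print(df["Date"].dtype)
```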
    
  • 2020-11-28 03:33

    It may be the case that dates need to be converted to a different frequency. In this case, I would suggest setting an index by dates.

    #set an index by dates
    df.set_index(['time'], drop=True, inplace=True)
    

    After this, you can more easily convert to the type of date format you will need most. Below, I sequentially convert to a number of date formats, ultimately ending up with a set of daily dates at the beginning of the month.

    #Convert to daily dates
    df.index = pd.DatetimeIndex(data=df.index)
    
    #Convert to monthly dates
    df.index = df.index.to_period(freq='M')
    
    #Convert to strings
    df.index = df.index.strftime('%Y-%m')
    
    #Convert back to daily dates (the first day of each month)
    df.index = pd.DatetimeIndex(data=df.index)
    

    For brevity, I don't show that I run the following code after each line above:

    print(df.index)
    print(df.index.dtype)
    print(type(df.index))
    

    This gives me the following output:

    Index(['2013-01-01', '2013-01-02', '2013-01-03'], dtype='object', name='time')
    object
    <class 'pandas.core.indexes.base.Index'>
    
    DatetimeIndex(['2013-01-01', '2013-01-02', '2013-01-03'], dtype='datetime64[ns]', name='time', freq=None)
    datetime64[ns]
    <class 'pandas.core.indexes.datetimes.DatetimeIndex'>
    
    PeriodIndex(['2013-01', '2013-01', '2013-01'], dtype='period[M]', name='time', freq='M')
    period[M]
    <class 'pandas.core.indexes.period.PeriodIndex'>
    
    Index(['2013-01', '2013-01', '2013-01'], dtype='object')
    object
    <class 'pandas.core.indexes.base.Index'>
    
    DatetimeIndex(['2013-01-01', '2013-01-01', '2013-01-01'], dtype='datetime64[ns]', freq=None)
    datetime64[ns]
    <class 'pandas.core.indexes.datetimes.DatetimeIndex'>
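
    If the only goal is to land on the first day of each month, the round trip through strings can probably be skipped. The sketch below is a different route from the one in this answer: it goes from the monthly `PeriodIndex` straight back to timestamps with `to_timestamp()`, which defaults to the start of each period. The sample frame is made up.

```python
import pandas as pd

# Made-up frame with date strings as the index
df = pd.DataFrame(
    {"a": [1, 2, 3]},
    index=pd.Index(["2013-01-01", "2013-01-02", "2013-02-03"], name="time"),
)

# Strings -> DatetimeIndex
df.index = pd.to_datetime(df.index)

# DatetimeIndex -> monthly PeriodIndex -> timestamps at the start of each month
month_start = df.index.to_period("M").to_timestamp()

print(month_start)
```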
    
  • 2020-11-28 03:36

    If you want to get the DATE and not DATETIME format:

    df["id_date"] = pd.to_datetime(df["id_date"]).dt.date
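
    One thing to be aware of: `.dt.date` returns Python `datetime.date` objects, so the column is no longer a `datetime64` column and the `.dt` accessor stops working on it. The sketch below, using made-up values, contrasts it with `.dt.normalize()`, which keeps the `datetime64` dtype and sets the time to midnight.

```python
import datetime

import pandas as pd

# Made-up sample data with a time component
df = pd.DataFrame({"id_date": ["2013-10-28 14:30:00", "2013-10-29 08:00:00"]})
ts = pd.to_datetime(df["id_date"])

# Python date objects; the Series is no longer datetime64
as_date = ts.dt.date

# Still datetime64, with the time part reset to 00:00:00
as_midnight = ts.dt.normalize()

print(type(as_date.iloc[0]))
print(as_midnight.dtype)
```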
    
  • 2020-11-28 03:40

    Use astype

    In [31]: df
    Out[31]: 
       a        time
    0  1  2013-01-01
    1  2  2013-01-02
    2  3  2013-01-03
    
    In [32]: df['time'] = df['time'].astype('datetime64[ns]')
    
    In [33]: df
    Out[33]: 
       a                time
    0  1 2013-01-01 00:00:00
    1  2 2013-01-02 00:00:00
    2  3 2013-01-03 00:00:00
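
    `astype` works well when every value is a valid date string. As a hedged sketch with made-up data, the snippet below shows what happens when one value is not a date: `astype` raises, while `pd.to_datetime` with `errors="coerce"` turns the bad value into `NaT`.

```python
import pandas as pd

# Made-up data with one value that is not a date
s = pd.Series(["2013-01-01", "not a date", "2013-01-03"])

# astype has no option to skip bad values, so it raises
try:
    s.astype("datetime64[ns]")
except (ValueError, TypeError) as exc:
    print("astype failed:", type(exc).__name__)

# to_datetime can replace unparseable values with NaT instead
converted = pd.to_datetime(s, errors="coerce")
print(converted)
```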
    
  • 2020-11-28 03:40
     #   Column          Non-Null Count   Dtype         
    ---  ------          --------------   -----         
     0   startDay        110526 non-null  object
     1   endDay          110526 non-null  object
    
    import pandas as pd
    
    df['startDay'] = pd.to_datetime(df.startDay)
    
    df['endDay'] = pd.to_datetime(df.endDay)
    
     #   Column          Non-Null Count   Dtype         
    ---  ------          --------------   -----         
     0   startDay        110526 non-null  datetime64[ns]
     1   endDay          110526 non-null  datetime64[ns]
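
    A runnable sketch of the same conversion on made-up data. Passing `format="%Y-%m-%d"` is optional and is an addition here, not part of the answer above; it assumes the strings are in YYYY-MM-DD order. Once both columns are datetimes, they can be subtracted directly.

```python
import pandas as pd

# Made-up sample data; both columns start out as strings
df = pd.DataFrame(
    {
        "startDay": ["2013-10-28", "2013-11-01"],
        "endDay": ["2013-10-30", "2013-11-05"],
    }
)

# Convert both columns, stating the expected format explicitly
for col in ["startDay", "endDay"]:
    df[col] = pd.to_datetime(df[col], format="%Y-%m-%d")

# Datetime columns support arithmetic, giving a timedelta column
df["duration"] = df["endDay"] - df["startDay"]

df.info()
print(df)
```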
    