Combine date column and time column into datetime column

Submitted by 核能气质少年 on 2020-01-13 09:34:54

Question


I have a pandas DataFrame like this (obtained by parsing an Excel file):

|       | COMPANY NAME            | MEETING DATE        | MEETING TIME |
|-------|-------------------------|---------------------|--------------|
| YKSGR | YAPI KREDİ SİGORTA A.Ş. | 2013-12-16 00:00:00 | 14:00:00     |
| TRCAS | TURCAS PETROL A.Ş.      | 2013-12-12 00:00:00 | 13:30:00     |

The MEETING DATE column holds timestamps with a representation like Timestamp('2013-12-20 00:00:00', tz=None), and MEETING TIME holds datetime.time objects with a representation like datetime.time(14, 0).

I want to combine MEETING DATE and MEETING TIME into one column. datetime.combine seems to do what I want; however, I need to apply it to whole columns somehow. How can I achieve this?


Answer 1:


You can use the apply method with axis=1 and call datetime.combine on each row:

>>> from datetime import datetime
>>> df.apply(lambda x: datetime.combine(x['MEETING DATE'], x['MEETING TIME']), axis=1)
0   2013-12-16 14:00:00
1   2013-12-12 13:30:00
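
To try this without the Excel file, here is a minimal sketch that rebuilds the two rows from the question by hand. The frame construction and the new column name MEETING DATETIME are assumptions for illustration, not part of the original post.

```python
from datetime import datetime, time

import pandas as pd

# Hand-built stand-in for the parsed Excel sheet (values copied from the question).
df = pd.DataFrame(
    {
        "COMPANY NAME": ["YAPI KREDİ SİGORTA A.Ş.", "TURCAS PETROL A.Ş."],
        "MEETING DATE": [pd.Timestamp("2013-12-16"), pd.Timestamp("2013-12-12")],
        "MEETING TIME": [time(14, 0), time(13, 30)],
    },
    index=["YKSGR", "TRCAS"],
)

# Timestamp is a datetime subclass, so datetime.combine accepts it as the date part.
combined = df.apply(
    lambda row: datetime.combine(row["MEETING DATE"], row["MEETING TIME"]),
    axis=1,
)

# Hypothetical column name for the result.
df["MEETING DATETIME"] = combined
print(df[["MEETING DATE", "MEETING TIME", "MEETING DATETIME"]])
```

Because apply with axis=1 calls the lambda once per row in Python, this is easy to read but not the fastest option on large frames.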



Answer 2:


Other solutions didn't work for me, so I came up with a workaround using replace instead of combine:

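def combine_date_time(df, datecol, timecol):
    # Overwrite the time fields of each date with those of the matching time.
    # second is included so that non-zero seconds are not silently dropped.
    return df.apply(
        lambda row: row[datecol].replace(
            hour=row[timecol].hour,
            minute=row[timecol].minute,
            second=row[timecol].second,
        ),
        axis=1,
    )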

In your case:

combine_date_time(df, 'MEETING DATE', 'MEETING TIME')

It feels slow (I haven't timed it properly), but it works.

UPDATE: I have timed both approaches on a relatively large dataset (more than 500,000 rows). Their run times are similar, but combine is faster (59 s for replace vs. 50 s for combine). Also, see jezrael's answer on this.

UPDATE2: I have tried jezrael's approach:

def combine_date_time(df, datecol, timecol):
    return pd.to_datetime(df[datecol].dt.date.astype(str)
                          + ' '
                          + df[timecol].astype(str))

This approach is far faster than the other two, so jezrael is right. I haven't measured it, but the difference is obvious.
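
For anyone who wants to measure the difference themselves, here is a rough timing sketch. The synthetic data, the row count n, and the function names combine_apply and combine_strings are all assumptions made for illustration; the numbers will depend on your machine and pandas version, so none are quoted here.

```python
import time as timer
from datetime import datetime, time

import pandas as pd

# Synthetic frame; the size is arbitrary and kept small so the sketch runs quickly.
n = 20_000
df = pd.DataFrame(
    {
        "MEETING DATE": pd.Series(pd.date_range("2013-01-01", periods=n, freq="D")),
        "MEETING TIME": [time(i % 24, i % 60) for i in range(n)],
    }
)


def combine_apply(df, datecol, timecol):
    # Row-wise approach: one Python-level call per row.
    return df.apply(lambda row: datetime.combine(row[datecol], row[timecol]), axis=1)


def combine_strings(df, datecol, timecol):
    # String approach: build "YYYY-MM-DD HH:MM:SS" strings, then parse them in one call.
    return pd.to_datetime(df[datecol].dt.date.astype(str) + " " + df[timecol].astype(str))


results = {}
for func in (combine_apply, combine_strings):
    start = timer.perf_counter()
    results[func.__name__] = func(df, "MEETING DATE", "MEETING TIME")
    print(f"{func.__name__}: {timer.perf_counter() - start:.3f} s")
```

Both functions should produce the same values, so it is worth comparing the two results before trusting either timing.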




Answer 3:


You can first convert the time column to strings and then to timedeltas with to_timedelta; after that it is easy to add the two columns:

print (type(df['MEETING DATE'].iat[0]))
<class 'pandas.tslib.Timestamp'>

print (type(df['MEETING TIME'].iat[0]))
<class 'datetime.time'>

print (df['MEETING DATE'] + pd.to_timedelta(df['MEETING TIME'].astype(str)))
YKSGR   2013-12-16 14:00:00
TRCAS   2013-12-12 13:30:00
dtype: datetime64[ns]
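
A self-contained sketch of the same idea, in case you want to run it without the original file. The frame is rebuilt by hand from the question's two rows, and the column name MEETING DATETIME is a hypothetical choice. It assumes, as in the question, that every value in MEETING DATE is at midnight.

```python
from datetime import time

import pandas as pd

# Hand-built stand-in for the question's frame.
df = pd.DataFrame(
    {
        "MEETING DATE": [pd.Timestamp("2013-12-16"), pd.Timestamp("2013-12-12")],
        "MEETING TIME": [time(14, 0), time(13, 30)],
    },
    index=["YKSGR", "TRCAS"],
)

# datetime.time objects stringify as "HH:MM:SS", which to_timedelta can parse.
offsets = pd.to_timedelta(df["MEETING TIME"].astype(str))

# Adding a timedelta column to a datetime column is a vectorized operation.
df["MEETING DATETIME"] = df["MEETING DATE"] + offsets
print(df)
```

If the date column might carry a non-midnight time, calling df["MEETING DATE"].dt.normalize() before the addition resets it to midnight first.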


Source: https://stackoverflow.com/questions/20009408/combine-date-column-and-time-column-into-datetime-column
