Get the time spent since midnight in dataframe

最后都变了 - submitted on 2021-01-28 09:59:55

Question


I have a DataFrame with a column of type Timestamp. I want to add a new column holding the time elapsed since midnight, in seconds. Is there a simple way to do this?

Example input:

samples['time']
2018-10-01 00:00:01.000000000
2018-10-01 00:00:12.000000000

type(samples['time'].iloc[0])

<class 'pandas._libs.tslib.Timestamp'>

Expected output:

samples['time_elapsed']
1
12

Answer 1:


Note that the date part may differ from row to row (the rows need not all fall on the same day), so you cannot use a single "base date" (midnight) for the whole DataFrame, as one of the other solutions does.

I also did not want to "contaminate" the source DataFrame with intermediate columns, e.g. the time (actually date and time) string converted to a "true" DateTime.

So my proposal is to:

  • convert the DateTime string to DateTime,
  • take the time part from it,
  • compute the number of seconds from hour / minute / second part.

All of the above steps go into a dedicated function.

So, to do the task, define a function:

import pandas as pd

def secSinceMidnight(datTimStr):
    # Parse the value and keep only its time-of-day part
    tt = pd.to_datetime(datTimStr).time()
    return tt.hour * 3600 + tt.minute * 60 + tt.second

Then call:

samples['Secs'] = samples.time.apply(secSinceMidnight)

For the source data:

samples = pd.DataFrame(data=[
    [ '2018-10-01 00:00:01' ], [ '2018-10-01 00:00:12' ],
    [ '2018-11-02 01:01:10' ], [ '2018-11-04 03:02:15' ] ],
    columns = ['time'])

printing the result gives:

                  time   Secs
0  2018-10-01 00:00:01      1
1  2018-10-01 00:00:12     12
2  2018-11-02 01:01:10   3670
3  2018-11-04 03:02:15  10935



Answer 2:


Doing this in Pandas is very simple!

midnight = pd.Timestamp('2018-10-01 00:00:00')
print((pd.Timestamp('2018-10-01 00:00:01.000000000') - midnight).seconds)
>
1

And by extension we can use an apply on a Pandas Series:

samples = pd.DataFrame(['2018-10-01 00:00:01.000000000', '2018-10-01 00:00:12.000000000'], columns=['time'])
samples.time = pd.to_datetime(samples.time)
midnight = pd.Timestamp('2018-10-01 00:00:00')
samples['time_elapsed'] = samples['time'].apply(lambda x: (x - midnight).seconds)
samples
>
                 time  time_elapsed
0 2018-10-01 00:00:01             1
1 2018-10-01 00:00:12            12

Note that other answers here use an alternative method: subtracting from each timestamp its own value converted to a date. The conversion zeros all time data, so it is equivalent to midnight of that day. This method might be slightly more performant.
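
A minimal sketch of that per-row midnight idea, assuming the column is already datetime64. The sample rows and the `own_midnight` name are illustrative only and are not taken from the answers:

```python
import pandas as pd

# Rows deliberately fall on different days
samples = pd.DataFrame(
    {"time": pd.to_datetime(["2018-10-01 00:00:12", "2018-11-02 01:01:10"])}
)

# Each row's own midnight: same date, time part set to 00:00:00
own_midnight = samples["time"].dt.normalize()

# Timedelta since that midnight, read out as whole seconds
samples["time_elapsed"] = (samples["time"] - own_midnight).dt.seconds

print(samples["time_elapsed"].tolist())  # [12, 3670]
```

Because every row is compared with its own midnight, no hard-coded base date is needed.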




Answer 3:


The current answers are either too complicated or too specialized.

samples = pd.DataFrame(data=['2018-10-01 00:00:01', '2018-10-01 00:00:12'], columns=['time'], dtype='datetime64[ns]')

samples['time_elapsed'] = ((samples['time'] - samples['time'].dt.normalize()) / pd.Timedelta('1 second')).astype(int)

print(samples)
                 time  time_elapsed
0 2018-10-01 00:00:01             1
1 2018-10-01 00:00:12            12

  • normalize() removes the time component from the datetime (moves the clock back to midnight).
  • pd.Timedelta('1 second') sets the unit of measurement, i.e. dividing by it gives the number of seconds in the timedelta.
  • .astype(int) casts the decimal number of seconds to int, truncating any fractional part. Round first if rounding is preferred.
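
To illustrate the difference between truncating and rounding, here is a small sketch with sub-second timestamps. The sample values and variable names are made up for the illustration:

```python
import pandas as pd

samples = pd.DataFrame(
    {"time": pd.to_datetime(["2018-10-01 00:00:01.750", "2018-10-01 00:00:12.250"])}
)

# Fractional seconds since each row's midnight
elapsed = (samples["time"] - samples["time"].dt.normalize()) / pd.Timedelta("1 second")

truncated = elapsed.astype(int)        # drops the fractional part
rounded = elapsed.round().astype(int)  # rounds to the nearest second first

print(elapsed.tolist())    # [1.75, 12.25]
print(truncated.tolist())  # [1, 12]
print(rounded.tolist())    # [2, 12]
```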



Answer 4:


I ran into the same problem in one of my projects, and here's how I solved it (assuming your time column has already been converted to Timestamp):

(samples['time'] - samples['time'].dt.normalize()) / pd.Timedelta(seconds=1)

The beauty of this approach is that you can change the last part to get seconds, minutes, hours or days elapsed:

... / pd.Timedelta(seconds=1) # seconds elapsed
... / pd.Timedelta(minutes=1) # minutes elapsed
... / pd.Timedelta(hours=1)   # hours elapsed
... / pd.Timedelta(days=1)    # days elapsed
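
As a concrete check of the unit switch, the sketch below applies all four divisors to two illustrative timestamps. The data and the `since_midnight` name are assumptions made for the example:

```python
import pandas as pd

samples = pd.DataFrame(
    {"time": pd.to_datetime(["2018-11-02 01:30:00", "2018-11-04 12:00:00"])}
)

# Time of day as a timedelta, computed once and reused
since_midnight = samples["time"] - samples["time"].dt.normalize()

seconds = since_midnight / pd.Timedelta(seconds=1)
minutes = since_midnight / pd.Timedelta(minutes=1)
hours = since_midnight / pd.Timedelta(hours=1)
days = since_midnight / pd.Timedelta(days=1)

print(seconds.tolist())  # [5400.0, 43200.0]
print(minutes.tolist())  # [90.0, 720.0]
print(hours.tolist())    # [1.5, 12.0]
print(days.tolist())     # [0.0625, 0.5]
```

Each result is a float Series, so fractional units such as 1.5 hours are preserved.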



Answer 5:


We can do:

    samples['time_elapsed'] = (
        samples['time'].dt.hour * 3600
        + samples['time'].dt.minute * 60
        + samples['time'].dt.second
    )
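
A self-contained sketch of this component-wise approach, reusing the sample data from answer 1. The `time_elapsed` column name follows the question. Note that any sub-second part is ignored, since only hour, minute and second are summed:

```python
import pandas as pd

samples = pd.DataFrame(
    {
        "time": pd.to_datetime(
            [
                "2018-10-01 00:00:01",
                "2018-10-01 00:00:12",
                "2018-11-02 01:01:10",
                "2018-11-04 03:02:15",
            ]
        )
    }
)

# The parentheses let the expression span several lines
samples["time_elapsed"] = (
    samples["time"].dt.hour * 3600
    + samples["time"].dt.minute * 60
    + samples["time"].dt.second
)

print(samples["time_elapsed"].tolist())  # [1, 12, 3670, 10935]
```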


Source: https://stackoverflow.com/questions/54787146/get-the-time-spent-since-midnight-in-dataframe
