Convert a column of datetimes to epoch in Python

青春惊慌失措 2020-12-09 04:46

I'm currently having an issue with Python. I have a Pandas DataFrame and one of the columns is a string with a date. The format is:

"%Y-%m-%d %H:%m

3 Answers
  • 2020-12-09 04:56

    I know this is old, but I believe the cleanest way is this:

    pd.Timestamp("20200918 20:30:05").value // 1000000
    

    This gives 1600461005000, which is the epoch time in milliseconds for the date above. The .value attribute is the number of nanoseconds since the epoch, so floor-dividing by 10**6 yields milliseconds. Floor-divide by 10**9 instead if you want the epoch in seconds.
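
A minimal sketch of the same idea with each unit spelled out. The variable names are illustrative only, and the timestamp is assumed to be naive, in which case pandas counts from the epoch as if the wall time were UTC.

```python
import pandas as pd

# Naive timestamp: .value counts nanoseconds as if the wall time were UTC
ts = pd.Timestamp("2020-09-18 20:30:05")

epoch_ns = ts.value                   # integer nanoseconds since 1970-01-01
epoch_ms = ts.value // 1_000_000      # integer milliseconds
epoch_s = ts.value // 1_000_000_000   # integer seconds

print(epoch_ms)  # 1600461005000
print(epoch_s)   # 1600461005
```

Integer floor division keeps the result an exact int, whereas dividing by a float first passes through a 64-bit float that cannot represent every nanosecond count exactly.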

  • 2020-12-09 05:04

    Convert the string to a datetime using to_datetime, then subtract the datetime 1970-1-1 and call dt.total_seconds():

    In [2]:
    import pandas as pd
    import datetime as dt
    df = pd.DataFrame({'date':['2011-04-24 01:30:00.000']})
    df
    
    Out[2]:
                          date
    0  2011-04-24 01:30:00.000
    
    In [3]:
    df['date'] = pd.to_datetime(df['date'])
    df
    
    Out[3]:
                     date
    0 2011-04-24 01:30:00
    
    In [6]:    
    (df['date'] - dt.datetime(1970,1,1)).dt.total_seconds()
    
    Out[6]:
    0    1303608600
    Name: date, dtype: float64
    

    You can see that converting this value back yields the same time:

    In [8]:
    pd.to_datetime(1303608600, unit='s')
    
    Out[8]:
    Timestamp('2011-04-24 01:30:00')
    

    So you can either add a new column or overwrite:

    In [9]:
    df['epoch'] = (df['date'] - dt.datetime(1970,1,1)).dt.total_seconds()
    df
    
    Out[9]:
                     date       epoch
    0 2011-04-24 01:30:00  1303608600
    

    EDIT

    A better method, as suggested by @Jeff:

    In [3]:
    df['date'].astype('int64')//1e9
    
    Out[3]:
    0    1303608600
    Name: date, dtype: float64
    
    In [4]:
    %timeit (df['date'] - dt.datetime(1970,1,1)).dt.total_seconds()
    %timeit df['date'].astype('int64')//1e9
    
    100 loops, best of 3: 1.72 ms per loop
    1000 loops, best of 3: 275 µs per loop
    

    The timings also show that the astype approach is roughly six times faster here (275 µs versus 1.72 ms per loop).
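
A hedged variant of the astype approach, for cases where an integer column is preferable to float64. It assumes the integer view should be in nanoseconds. Because pandas 2.0 and later can store datetimes at other resolutions, this sketch casts to datetime64[ns] explicitly before taking the integer view. The column name epoch follows the example above.

```python
import pandas as pd

df = pd.DataFrame({"date": ["2011-04-24 01:30:00.000"]})
df["date"] = pd.to_datetime(df["date"])

# Cast to nanosecond resolution explicitly so the integer view is in ns
ns = df["date"].astype("datetime64[ns]").astype("int64")

# Integer floor division (10**9, not 1e9) keeps the dtype int64
df["epoch"] = ns // 10**9

print(df["epoch"].iloc[0])  # 1303608600
print(df["epoch"].dtype)    # int64
```

Dividing by the float 1e9 instead, as in the original snippet, produces the same numbers but as float64.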

  • 2020-12-09 05:16

    From the Pandas documentation on working with time series data:

    We subtract the epoch (midnight at January 1, 1970 UTC) and then floor divide by the “unit” (1 ms).

    # generate some timestamps
    stamps = pd.date_range('2012-10-08 18:15:05', periods=4, freq='D')
    
    # convert it to milliseconds from epoch
    (stamps - pd.Timestamp("1970-01-01")) // pd.Timedelta('1ms')
    

    This will give the epoch time in milliseconds.
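
As a rough illustration, the same subtraction works with other units by changing the Timedelta. The names epoch, epoch_ms and epoch_s below are illustrative and not taken from the documentation.

```python
import pandas as pd

# Four daily timestamps, as in the snippet above
stamps = pd.date_range("2012-10-08 18:15:05", periods=4, freq="D")
epoch = pd.Timestamp("1970-01-01")

# Floor-divide the elapsed time by the desired unit
epoch_ms = (stamps - epoch) // pd.Timedelta("1ms")
epoch_s = (stamps - epoch) // pd.Timedelta("1s")

print(list(epoch_s))
# [1349720105, 1349806505, 1349892905, 1349979305]
```

This form does not depend on how the datetimes are stored internally, since the unit is stated in the Timedelta rather than implied by a divisor.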
