Difference between evaluating timestamp and total_seconds

感情败类 2021-01-07 00:52

When I compute the number of seconds between two dates with Python's datetime module using two different methods (timestamp() or total_seconds()), I get different results.

3 Answers
  • 2021-01-07 01:27

    The discrepancy is caused by daylight saving time (DST). If one of your dates falls within your timezone's DST period and the other does not, the two calculations differ by exactly one hour.

    From 1966 to 1973, DST in the United States ran from the last Sunday in April to the last Sunday in October, which explains @JoshuaRLi's findings.

    When you subtract two naive datetimes, Python ignores DST: t1 - t2 produces datetime.timedelta(162), a difference of exactly 162 days, even though the time that actually elapsed is 162 * 24 - 1 hours (the - 1 accounts for the hour skipped when DST began). timestamp() handles this correctly: both timestamps are relative to UTC, so the timestamp of the date inside DST is one hour earlier than wall-clock arithmetic suggests, because an hour was skipped to reach it.
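
As a minimal sketch of the effect described above, the snippet below pins the process to a fixed DST rule so the result does not depend on the machine's own timezone. The TZ string, the time.tzset() call (available on Unix only), and the two dates are assumptions chosen for illustration; they are not taken from the question.

```python
import os
import time
from datetime import datetime

# Assumption: a POSIX TZ rule that mimics US Eastern time with DST running
# from the last Sunday in April to the last Sunday in October.
# time.tzset() exists on Unix only.
os.environ["TZ"] = "EST5EDT,M4.5.0,M10.5.0"
time.tzset()

# Hypothetical dates, 162 days apart, straddling the 1970-04-26 transition.
t2 = datetime(1970, 1, 1)
t1 = datetime(1970, 6, 12)

wall_clock = (t1 - t2).total_seconds()     # naive subtraction, ignores DST
elapsed = t1.timestamp() - t2.timestamp()  # each value is converted to UTC first

print(wall_clock)            # 13996800.0, i.e. 162 * 86400
print(elapsed)               # 13993200.0, one hour less
print(wall_clock - elapsed)  # 3600.0
```

Neither number is wrong: total_seconds() answers "how far apart are the two wall-clock readings", while the timestamp difference answers "how much time actually passed".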

  • 2021-01-07 01:42

    Daylight saving time

    Subtracting two naive dt.datetime objects (the __sub__ magic method) produces a dt.timedelta that ignores daylight saving time.

    The epoch timestamp conversion, timestamp(), does take daylight saving time into account, which explains the 3600-second (1 hour) difference.

    See my detective post below. This was fun!


    Whipped up a quick script, since this seemed interesting to me.

    This was run on both 3.5.4 and 3.6.2 with the same output.

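    import datetime as dt

    t1 = dt.datetime(1970, 1, 1, 0, 0, 0)
    t2 = dt.datetime(1970, 1, 1, 0, 0, 0)

    # Advance t1 one day at a time and compare the two ways of measuring t1 - t2.
    for _ in range(365):
        try:
            d1 = t1.timestamp() - t2.timestamp()  # difference of epoch timestamps
            d2 = (t1 - t2).total_seconds()        # naive datetime subtraction
            assert d1 == d2
        except AssertionError:
            print(t1, d2 - d1)  # report the date and the size of the mismatch
        t1 += dt.timedelta(days=1)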
    

    I got this output. The mismatches start on 1970-04-27, and the difference is consistently one hour, which suggests the jump happens only once (actually never mind, keep reading):

    1970-04-27 00:00:00 3600.0
    1970-04-28 00:00:00 3600.0
    1970-04-29 00:00:00 3600.0
    ...
    

    I wrote a second script:

    import datetime as dt
    
    t = dt.datetime(1970,1,1,0,0,0)
    sid = 60*60*24
    
    while 1:
        prev = t
        t += dt.timedelta(days=1)
        diff1 = (t-prev).total_seconds()
        diff2 = t.timestamp() - prev.timestamp()
        try:
            assert diff1 == diff2 == sid
        except AssertionError:
            print(diff1, diff2, t, prev)
            exit(1)
    

    Output:

    86400.0 82800.0 1970-04-27 00:00:00 1970-04-26 00:00:00
    

    When you remove the exit(1), the output becomes interesting:

    86400.0 82800.0 1970-04-27 00:00:00 1970-04-26 00:00:00
    86400.0 90000.0 1970-10-26 00:00:00 1970-10-25 00:00:00
    86400.0 82800.0 1971-04-26 00:00:00 1971-04-25 00:00:00
    86400.0 90000.0 1971-11-01 00:00:00 1971-10-31 00:00:00
    86400.0 82800.0 1972-05-01 00:00:00 1972-04-30 00:00:00
    86400.0 90000.0 1972-10-30 00:00:00 1972-10-29 00:00:00
    86400.0 82800.0 1973-04-30 00:00:00 1973-04-29 00:00:00
    86400.0 90000.0 1973-10-29 00:00:00 1973-10-28 00:00:00
    86400.0 82800.0 1974-01-07 00:00:00 1974-01-06 00:00:00
    86400.0 90000.0 1974-10-28 00:00:00 1974-10-27 00:00:00
    86400.0 82800.0 1975-02-24 00:00:00 1975-02-23 00:00:00
    86400.0 90000.0 1975-10-27 00:00:00 1975-10-26 00:00:00
    86400.0 82800.0 1976-04-26 00:00:00 1976-04-25 00:00:00
    86400.0 90000.0 1976-11-01 00:00:00 1976-10-31 00:00:00
    ...
    

    At first glance, the epoch timestamp difference t.timestamp() - prev.timestamp() looks unreliable: it oscillates between minus and plus one hour at irregular but widely spaced dates (EDIT: I realized these are historical daylight saving time transition dates). If you keep the script running, the oscillation continues until datetime runs out of representable dates:

    86400.0 82800.0 9997-03-10 00:00:00 9997-03-09 00:00:00
    86400.0 90000.0 9997-11-03 00:00:00 9997-11-02 00:00:00
    86400.0 82800.0 9998-03-09 00:00:00 9998-03-08 00:00:00
    86400.0 90000.0 9998-11-02 00:00:00 9998-11-01 00:00:00
    86400.0 82800.0 9999-03-15 00:00:00 9999-03-14 00:00:00
    86400.0 90000.0 9999-11-08 00:00:00 9999-11-07 00:00:00
    Traceback (most recent call last):
      File "check.py", line 8, in <module>
        t += dt.timedelta(days=1)
    OverflowError: date value out of range
    

    This behavior prompted me to take a closer look at the output of my first script:

    ...
    1970-10-24 00:00:00 3600.0
    1970-10-25 00:00:00 3600.0
    1971-04-26 00:00:00 3600.0
    1971-04-27 00:00:00 3600.0
    ...
    

    Wow, so there are no AssertionErrors between 1970-10-25 and 1971-04-26, exclusive. This matches the oscillation found with the second script.

    This is getting really weird...

    Wait a moment... DAYLIGHT SAVINGS TIME
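
To check that the irregular days really are DST transitions, here is a bounded variant of the scan, offered as a sketch. It assumes a fixed POSIX TZ rule (last Sunday in April to last Sunday in October) applied via time.tzset(), which is Unix only; the function name irregular_days is hypothetical. A fixed rule will not reproduce the unusual 1974 and 1975 start dates seen in the output above.

```python
import os
import time
from datetime import datetime, timedelta

# Assumption: a fixed rule approximating US Eastern time around 1970.
os.environ["TZ"] = "EST5EDT,M4.5.0,M10.5.0"
time.tzset()

def irregular_days(year):
    """Return (date, elapsed_seconds) for local days in `year` that are not 24 hours long."""
    found = []
    day = datetime(year, 1, 1)
    while day.year == year:
        next_day = day + timedelta(days=1)
        # Real elapsed time between two consecutive local midnights.
        elapsed = next_day.timestamp() - day.timestamp()
        if elapsed != 86400:
            found.append((day.date(), elapsed))
        day = next_day
    return found

for date, elapsed in irregular_days(1970):
    print(date, elapsed)
# 1970-04-26 82800.0   (23-hour day, clocks go forward)
# 1970-10-25 90000.0   (25-hour day, clocks go back)
```

The two days found are the last Sundays of April and October, which lines up with the first two rows printed by the second script.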

  • 2021-01-07 01:42

    datetime.timestamp() is only available in Python 3 (new in version 3.3). Python 2 has no such method.

    Changed in version 3.6: The timestamp() method uses the fold attribute to disambiguate the times during a repeated interval.
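
A small sketch of what the fold attribute changes, assuming Python 3.6+ and a fixed POSIX TZ rule applied with time.tzset() (Unix only). The TZ string and the chosen date are illustrative assumptions. Under this rule, clocks go back at 02:00 on 1970-10-25, so 01:30 occurs twice that night.

```python
import os
import time
from datetime import datetime

# Assumption: DST ends on the last Sunday in October at 02:00 local time.
os.environ["TZ"] = "EST5EDT,M4.5.0,M10.5.0"
time.tzset()

first = datetime(1970, 10, 25, 1, 30)           # fold=0: first 01:30, still on daylight time
second = datetime(1970, 10, 25, 1, 30, fold=1)  # fold=1: second 01:30, back on standard time

print(second.timestamp() - first.timestamp())  # 3600.0
print((second - first).total_seconds())        # 0.0, naive subtraction ignores fold
```

This is the same split as before: timestamp() distinguishes the two instants, while naive arithmetic sees identical wall-clock readings.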

    Note: There is no method to obtain the POSIX timestamp directly from a naive datetime instance representing UTC time. If your application uses this convention and your system timezone is not set to UTC, you can obtain the POSIX timestamp by supplying tzinfo=timezone.utc: timestamp = dt.replace(tzinfo=timezone.utc).timestamp() or by calculating the timestamp directly: timestamp = (dt - datetime(1970, 1, 1)) / timedelta(seconds=1)
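
The two formulas in the note can be sketched as follows. The helper names utc_timestamp and utc_timestamp_direct and the sample dates are hypothetical; the point is that when naive datetimes are treated as UTC, no DST rule applies, so the timestamp difference agrees with total_seconds() whatever the system timezone is.

```python
from datetime import datetime, timedelta, timezone

def utc_timestamp(naive_utc):
    """POSIX timestamp of a naive datetime that is understood to be in UTC."""
    return naive_utc.replace(tzinfo=timezone.utc).timestamp()

def utc_timestamp_direct(naive_utc):
    """Same value, computed by plain arithmetic against the epoch."""
    return (naive_utc - datetime(1970, 1, 1)) / timedelta(seconds=1)

# Hypothetical dates, 162 days apart.
t2 = datetime(1970, 1, 1)
t1 = datetime(1970, 6, 12)

print(utc_timestamp(t1))                      # 13996800.0
print(utc_timestamp_direct(t1))               # 13996800.0
print(utc_timestamp(t1) - utc_timestamp(t2))  # 13996800.0
print((t1 - t2).total_seconds())              # 13996800.0
```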
