Parsing of Ordered Timestamps in Local Time (to UTC) While Observing Daylight Saving Time

借酒劲吻你 2020-11-27 08:26

I have CSV data files with timestamped records that are in local time. Unfortunately the data files cover the period where daylight saving time changes (Nov 3rd 2013) so th

2 Answers
  • 2020-11-27 08:55

    In C#:

    using System;
    using System.Globalization;

    // Define the input values.
    string[] input =
    {
        "2013-11-03 00:45:00",
        "2013-11-03 01:00:00",
        "2013-11-03 01:15:00",
        "2013-11-03 01:30:00",
        "2013-11-03 01:45:00",
        "2013-11-03 01:00:00",
        "2013-11-03 01:15:00",
        "2013-11-03 01:30:00",
        "2013-11-03 01:45:00",
        "2013-11-03 02:00:00",
    };
    
    // Get the time zone the input is meant to be interpreted in.
    TimeZoneInfo tz = TimeZoneInfo.FindSystemTimeZoneById("Eastern Standard Time");
    
    // Create an array for the output values
    DateTimeOffset[] output = new DateTimeOffset[input.Length];
    
    // Start with the assumption that DST is active, as ambiguities occur when moving
    // out of daylight time into standard time.
    bool dst = true;
    
    // Iterate through the input.
    for (int i = 0; i < input.Length; i++)
    {
        // Parse the input string as a DateTime with Unspecified kind
        DateTime dt = DateTime.ParseExact(input[i], "yyyy-MM-dd HH:mm:ss",
                                          CultureInfo.InvariantCulture);
    
        // Determine the offset.
        TimeSpan offset;
        if (tz.IsAmbiguousTime(dt))
        {
            // Get the possible offsets, and use the DST flag and the previous entry
            // to determine if we are past the transition point.  This only works
            // because we have outside knowledge that the items are in sequence.
            TimeSpan[] offsets = tz.GetAmbiguousTimeOffsets(dt);
            offset = dst && (i == 0 || dt >= output[i - 1].DateTime)
                     ? offsets[1] : offsets[0];
        }
        else
        {
            // The value is unambiguous, so just get the single offset it can be.
            offset = tz.GetUtcOffset(dt);
        }
    
        // Use the determined values to construct a DateTimeOffset
        DateTimeOffset dto = new DateTimeOffset(dt, offset);
    
        // We can unambiguously check a DateTimeOffset for daylight saving time,
        // which sets up the DST flag for the next iteration.
        dst = tz.IsDaylightSavingTime(dto);
    
        // Save the DateTimeOffset to the output array.
        output[i] = dto;
    }
    
    
    // Show the output for debugging
    foreach (var dto in output)
    {
        Console.WriteLine("{0:yyyy-MM-dd HH:mm:ss zzzz} => {1:yyyy-MM-dd HH:mm:ss} UTC",
                           dto, dto.UtcDateTime);
    }
    

    Output:

    2013-11-03 00:45:00 -04:00 => 2013-11-03 04:45:00 UTC
    2013-11-03 01:00:00 -04:00 => 2013-11-03 05:00:00 UTC
    2013-11-03 01:15:00 -04:00 => 2013-11-03 05:15:00 UTC
    2013-11-03 01:30:00 -04:00 => 2013-11-03 05:30:00 UTC
    2013-11-03 01:45:00 -04:00 => 2013-11-03 05:45:00 UTC
    2013-11-03 01:00:00 -05:00 => 2013-11-03 06:00:00 UTC
    2013-11-03 01:15:00 -05:00 => 2013-11-03 06:15:00 UTC
    2013-11-03 01:30:00 -05:00 => 2013-11-03 06:30:00 UTC
    2013-11-03 01:45:00 -05:00 => 2013-11-03 06:45:00 UTC
    2013-11-03 02:00:00 -05:00 => 2013-11-03 07:00:00 UTC
    

    Note that this assumes the first ambiguous time you encounter, such as 1:00, is in DST. If your list were truncated to just the last five entries, you would have no way of knowing that they were in standard time. There is not much you can do in that particular case.
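
    To make that limitation concrete, here is a minimal Python sketch (not part of the C# answer above) that lists every UTC instant a bare local time could denote. It assumes Python 3.9+ with the standard `zoneinfo` module and an available tz database; the helper name `candidates` is made up for illustration.

```python
from datetime import datetime, timezone
from zoneinfo import ZoneInfo  # standard library since Python 3.9


def candidates(naive, tz):
    """Return the distinct UTC instants a naive local time could denote.

    fold=0 selects the first occurrence of a repeated wall-clock time,
    fold=1 the second. For an unambiguous time both give the same instant.
    Caveat: a nonexistent time (spring-forward gap) also yields two values.
    """
    first = naive.replace(tzinfo=tz, fold=0).astimezone(timezone.utc)
    second = naive.replace(tzinfo=tz, fold=1).astimezone(timezone.utc)
    return sorted({first, second})


tz = ZoneInfo("America/New_York")

# 01:00 occurs twice on 2013-11-03, so two UTC instants are possible.
for utc in candidates(datetime(2013, 11, 3, 1, 0, 0), tz):
    print(utc.strftime("%Y-%m-%d %H:%M:%S UTC"))

# 00:45 occurs only once, so there is a single candidate.
print(len(candidates(datetime(2013, 11, 3, 0, 45, 0), tz)))
```

    With only the last five entries, every ambiguous row has two valid candidates and no earlier row to rule one out, so some outside knowledge (for example, the UTC time of the first record) is needed.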

  • 2020-11-27 08:56

    If successive timestamps can never go backwards when expressed in UTC, then this Python script can convert the local times into UTC:

    #!/usr/bin/env python3
    import sys
    from datetime import datetime, timedelta
    import pytz  # $ pip install pytz
    
    tz = pytz.timezone('America/New_York' if len(sys.argv) < 2 else sys.argv[1])
    previous = None #XXX set it from UTC time: `first_entry_utc.astimezone(tz)`
    for line in sys.stdin: # read from stdin
        naive = datetime.strptime(line.strip(), "%Y/%m/%d %H:%M:%S") # no timezone
        try:
            local = tz.localize(naive, is_dst=None) # attach timezone info
        except pytz.AmbiguousTimeError:
            # assume ambiguous time always corresponds to True -> False transition
            local = tz.localize(naive, is_dst=True)
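            # `previous` is None on the first line, so guard the comparison
            if previous is not None and previous >= local: # timestamps must be increasing
                local = tz.localize(naive, is_dst=False)
            assert previous is None or previous < local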
        #NOTE: allow NonExistentTimeError to propagate (there shouldn't be
        # invalid local times in the input)
        previous = local
        utc = local.astimezone(pytz.utc)
        timestamp = utc.timestamp()
        time_format = "%Y-%m-%d %H:%M:%S %Z%z"
        print("{local:{time_format}}; {utc:{time_format}}; {timestamp:.0f}"
              .format_map(vars()))
    

    Input

    2013/11/03 00:45:00
    2013/11/03 01:00:00
    2013/11/03 01:15:00
    2013/11/03 01:30:00
    2013/11/03 01:45:00
    2013/11/03 01:00:00
    2013/11/03 01:15:00
    2013/11/03 01:30:00
    2013/11/03 01:45:00
    2013/11/03 02:00:00
    

    Output

    2013-11-03 00:45:00 EDT-0400; 2013-11-03 04:45:00 UTC+0000; 1383453900
    2013-11-03 01:00:00 EDT-0400; 2013-11-03 05:00:00 UTC+0000; 1383454800
    2013-11-03 01:15:00 EDT-0400; 2013-11-03 05:15:00 UTC+0000; 1383455700
    2013-11-03 01:30:00 EDT-0400; 2013-11-03 05:30:00 UTC+0000; 1383456600
    2013-11-03 01:45:00 EDT-0400; 2013-11-03 05:45:00 UTC+0000; 1383457500
    2013-11-03 01:00:00 EST-0500; 2013-11-03 06:00:00 UTC+0000; 1383458400
    2013-11-03 01:15:00 EST-0500; 2013-11-03 06:15:00 UTC+0000; 1383459300
    2013-11-03 01:30:00 EST-0500; 2013-11-03 06:30:00 UTC+0000; 1383460200
    2013-11-03 01:45:00 EST-0500; 2013-11-03 06:45:00 UTC+0000; 1383461100
    2013-11-03 02:00:00 EST-0500; 2013-11-03 07:00:00 UTC+0000; 1383462000
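
    For comparison, the same "timestamps must keep increasing" idea can be sketched without pytz, using the standard-library `zoneinfo` module and the `fold` attribute (Python 3.9+). This is an illustrative variant under stated assumptions, not the script above: the function name `localize_ordered` and its parameters are hypothetical, and unlike pytz with `is_dst=None`, `zoneinfo` does not raise on nonexistent local times.

```python
from datetime import datetime, timezone
from zoneinfo import ZoneInfo  # standard library since Python 3.9


def localize_ordered(lines, tz_name="America/New_York",
                     fmt="%Y/%m/%d %H:%M:%S"):
    """Convert ordered local timestamps to UTC, resolving repeated hours.

    Assumes the records are strictly increasing in real (UTC) time.
    """
    tz = ZoneInfo(tz_name)
    previous = None  # last UTC instant emitted
    result = []
    for line in lines:
        naive = datetime.strptime(line.strip(), fmt)
        # Try the first occurrence of the wall-clock time (fold=0).
        utc = naive.replace(tzinfo=tz, fold=0).astimezone(timezone.utc)
        if previous is not None and utc <= previous:
            # Going backwards: this must be the second occurrence (fold=1).
            utc = naive.replace(tzinfo=tz, fold=1).astimezone(timezone.utc)
        if previous is not None and utc <= previous:
            raise ValueError("timestamps are not increasing: %r" % line)
        previous = utc
        result.append(utc)
    return result


rows = [
    "2013/11/03 01:30:00",
    "2013/11/03 01:45:00",
    "2013/11/03 01:00:00",  # wall clock repeats after the fall-back
    "2013/11/03 02:00:00",
]
for utc in localize_ordered(rows):
    print(utc.strftime("%Y-%m-%d %H:%M:%S UTC"), int(utc.timestamp()))
```

    As with the original script, a first row that falls inside the repeated hour is assumed to be in daylight time unless earlier context says otherwise.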
    