How do I convert time zones correctly?

忘了有多久 2021-02-11 02:17

I am using the fasttime package for its fastPOSIXct function that can read character datetimes very efficiently. My problem is that it can only read character datet

3 Answers
  •  旧巷少年郎
    2021-02-11 02:56

    The smart thing to do here is almost certainly to write readable, easy-to-maintain code, and throw more hardware at the problem if your code is too slow.

    If you are desperate for a code speedup, then you could write a custom time-zone adjustment function. It isn't pretty, so if you have to convert between many time zones, you'll end up with spaghetti code. Here's my solution for the specific case of converting from GMT to Montreal time.

    First, precompute a table of daylight saving time start and end dates. You'll need to extend it before 2010 and after 2013 to cover your dataset. I found the dates here:

    http://www.timeanddate.com/worldclock/timezone.html?n=165

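    library(fasttime)

    #DST start and end times, parsed as GMT. cbind drops the POSIXct class,
    #leaving a numeric matrix of seconds since 1970-01-01
    montreal_tz_data <- cbind(
      start = fastPOSIXct(
        c("2010-03-14 07:00:00", "2011-03-13 07:00:00", "2012-03-11 07:00:00", "2013-03-10 07:00:00")
      ),
      end   = fastPOSIXct(
        c("2010-11-07 06:00:00", "2011-11-06 06:00:00", "2012-11-04 06:00:00", "2013-11-03 06:00:00")
      )
    )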
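
    If typing the dates in by hand is too error-prone, the table can probably be derived from the system time-zone database instead. The following is only a sketch using base R: find_dst_periods is a hypothetical helper (not part of fasttime), and it assumes that your system knows the "America/Montreal" zone and that each requested year starts and ends outside DST.

    ```r
    #Hypothetical helper: derive DST start/end times (seconds since 1970-01-01 GMT)
    #from the system time-zone database rather than a hand-typed list
    find_dst_periods <- function(years, tz = "America/Montreal")
    {
      #Hourly grid in UTC covering the requested years
      grid <- seq(
        ISOdatetime(min(years), 1, 1, 0, 0, 0, tz = "UTC"),
        ISOdatetime(max(years) + 1, 1, 1, 0, 0, 0, tz = "UTC"),
        by = "hour"
      )
      #isdst is positive while daylight saving time is in effect in tz
      is_dst <- as.POSIXlt(grid, tz = tz)$isdst > 0
      #+1 marks the first hour inside DST, -1 the first hour after it
      change <- diff(is_dst)
      cbind(
        start = as.numeric(grid[-1][change == 1]),
        end   = as.numeric(grid[-1][change == -1])
      )
    }

    #Same shape as the hand-typed table: one row per year
    montreal_tz_data_auto <- find_dst_periods(2010:2013)
    ```

    The result is a numeric matrix with start and end columns, so it should be usable wherever the hand-typed table is, and changing the years argument extends the coverage.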
    

    For speed, the function to change time zones treats the times as numbers.

    to_montreal_tz <- function(x)
    {
      x <- as.numeric(x)
      is_dst <- logical(length(x))  #initialise as FALSE
      #Loop over DST periods in each year
      for(row in seq_len(nrow(montreal_tz_data)))
      {
        is_dst[x > montreal_tz_data[row, 1] & x < montreal_tz_data[row, 2]] <- TRUE
      }
      #Hard-coded numbers are 4/5 hours in seconds
      ans <- ifelse(is_dst, x + 14400, x + 18000)
      class(ans) <- c("POSIXct", "POSIXt")
      ans
    }
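
    The loop and ifelse can likely be replaced with a single lookup. This is a sketch of a variant, not a drop-in replacement: to_montreal_tz_vec, dst_start, dst_end and dst_breaks are names made up for the example, the table is rebuilt with base R so the block runs without fasttime, and findInterval treats each DST period as [start, end), whereas the loop above uses strict inequalities at both ends.

    ```r
    #DST periods as seconds since 1970-01-01 GMT (same dates as the table above)
    dst_start <- as.numeric(as.POSIXct(
      c("2010-03-14 07:00:00", "2011-03-13 07:00:00", "2012-03-11 07:00:00", "2013-03-10 07:00:00"),
      tz = "UTC"
    ))
    dst_end <- as.numeric(as.POSIXct(
      c("2010-11-07 06:00:00", "2011-11-06 06:00:00", "2012-11-04 06:00:00", "2013-11-03 06:00:00"),
      tz = "UTC"
    ))
    #Interleave into sorted breakpoints: start1, end1, start2, end2, ...
    dst_breaks <- as.vector(rbind(dst_start, dst_end))

    to_montreal_tz_vec <- function(x)
    {
      x <- as.numeric(x)
      #The interval index is odd exactly when x falls inside a DST period
      is_dst <- findInterval(x, dst_breaks) %% 2 == 1
      #Add 5 hours, minus 1 hour during DST
      .POSIXct(x + 18000 - 3600 * is_dst)
    }

    #Example: one summer and one winter time, parsed as if they were GMT
    to_montreal_tz_vec(as.POSIXct(c("2010-03-15 12:37:17", "2010-01-15 12:00:00"), tz = "UTC"))
    ```

    I haven't timed this variant, so treat any speed difference from the loop version as something to measure rather than assume.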
    

    Now, to compare times:

    #A million dates
    ch <- rep("2010-03-15 12:37:17.223", 1e6)
    #The easy way (no conversion of time zones afterwards)
    system.time(as.POSIXct(ch, tz="America/Montreal"))
    #   user  system elapsed 
    #  28.96    0.05   29.00 
    
    #A slight performance gain by specifying the format
    system.time(as.POSIXct(ch, format = "%Y-%m-%d %H:%M:%S", tz="America/Montreal"))
    #   user  system elapsed 
    #  13.77    0.01   13.79 
    
    #Using the fast functions
    library(fasttime)
    system.time(to_montreal_tz(fastPOSIXct(ch)))    
    #    user  system elapsed 
    #    0.51    0.02    0.53 
    

    As with all optimisation tricks, you've either got a 27-fold speedup (yay!) or you've saved 13 seconds of processing time but added 3 days of code-maintenance time from an obscure bug when your DST table runs out in 2035 (boo!).
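
    One way to make that failure loud rather than obscure is to check the input against the span of the table before converting. This is a minimal sketch: assert_dst_table_covers is a hypothetical helper, and tz_data is assumed to be a two-column numeric matrix of DST start and end times in seconds, like the table above. The check is deliberately conservative, so it also rejects winter times before the first start or after the last end, even though those would be converted correctly.

    ```r
    #Hypothetical guard: stop if any time lies outside the span of the DST table
    assert_dst_table_covers <- function(x, tz_data)
    {
      x <- as.numeric(x)
      lo <- min(tz_data[, 1])  #earliest DST start in the table
      hi <- max(tz_data[, 2])  #latest DST end in the table
      outside <- !is.na(x) & (x < lo | x > hi)
      if(any(outside))
      {
        stop(sum(outside), " time(s) fall outside the DST table; extend the table first")
      }
      invisible(TRUE)
    }
    ```

    Calling it on the parsed times before the conversion costs one extra pass over the data, which is small next to the parsing itself.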
