How do I correctly convert timezones?

渐次进展 2021-02-11 02:10

I am using the fasttime package for its fastPOSIXct function that can read character datetimes very efficiently. My problem is that it can only read character datet

3 answers
  • 2021-02-11 02:38

    Could you not just add the appropriate number of seconds to correct the offset from GMT?

    # Original problem
    fastPOSIXct("2010-03-15 12:37:17.223",tz="America/Montreal")
    # [1] "2010-03-15 08:37:17 EDT"
    
    # Add 4 hours' worth of seconds (4 * 3600 = 14400) to the data. This should be very quick.
    # Note: 4 hours matches EDT only; during EST the shift would be 18000.
    fastPOSIXct("2010-03-15 12:37:17.223",tz="America/Montreal") + 14400
    # [1] "2010-03-15 12:37:17 EDT"
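
    The fixed 14400-second shift matches Montreal only while daylight saving time is in effect; in winter the gap to GMT is 5 hours. As a rough illustration in base R, the sketch below measures the offset for a given day by parsing the same wall-clock string once in the target zone and once in UTC. The helper name `utc_offset_seconds` is made up for this example and is not part of fasttime.

    ```r
    # Hypothetical helper: UTC offset (in seconds to add to a string parsed as UTC)
    # for local noon on `day` in time zone `tz`.
    utc_offset_seconds <- function(day, tz) {
      stamp <- paste(day, "12:00:00")
      local_noon <- as.POSIXct(stamp, tz = tz)     # noon on the local clock
      utc_noon   <- as.POSIXct(stamp, tz = "UTC")  # noon on the UTC clock
      as.numeric(local_noon) - as.numeric(utc_noon)
    }

    utc_offset_seconds("2010-03-15", "America/Montreal")  # 14400 (EDT)
    utc_offset_seconds("2010-01-15", "America/Montreal")  # 18000 (EST)
    ```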
    
  • 2021-02-11 02:48

    The smart thing to do here is almost certainly to write readable, easy-to-maintain code, and throw more hardware at the problem if your code is too slow.

    If you are desperate for a code speedup, then you could write a custom time-zone adjustment function. It isn't pretty, so if you have to convert between many time zones, you'll end up with spaghetti code. Here's my solution for the specific case of converting from GMT to Montreal time.

    First, precompute a table of daylight saving time start and end dates. You'll need to extend it to cover dates before 2010 or after 2013 if your dataset goes beyond that range. I found the dates here:

    http://www.timeanddate.com/worldclock/timezone.html?n=165

    montreal_tz_data <- cbind(
      start = fastPOSIXct(
        c("2010-03-14 07:00:00", "2011-03-13 07:00:00", "2012-03-11 07:00:00", "2013-03-10 07:00:00")
      ),
      end   = fastPOSIXct(
        c("2010-11-07 06:00:00", "2011-11-06 06:00:00", "2012-11-04 06:00:00", "2013-11-03 06:00:00")
      )
    )
    

    For speed, the function to change time zones treats the times as numbers.

    to_montreal_tz <- function(x)
    {
      x <- as.numeric(x)
      is_dst <- logical(length(x))  #initialise as FALSE
      #Loop over DST periods in each year
      for(row in seq_len(nrow(montreal_tz_data)))
      {
        is_dst[x > montreal_tz_data[row, 1] & x < montreal_tz_data[row, 2]] <- TRUE
      }
      #Hard-coded numbers are 4/5 hours in seconds
      ans <- ifelse(is_dst, x + 14400, x + 18000)
      class(ans) <- c("POSIXct", "POSIXt")
      ans
    }
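
    If the loop over table rows ever becomes a bottleneck, one possible variation is to flatten the table into a single sorted vector of break points and let `findInterval()` classify every timestamp in one pass. This is only a sketch: the names `dst_breaks` and `to_montreal_tz_vec` are invented here, the dates are parsed with base R as UTC so the block runs without fasttime, and a timestamp exactly equal to a DST start is counted as DST, whereas the loop above uses strict comparisons.

    ```r
    # DST start/end instants from the table above, interleaved in ascending order.
    # Parsed with base R as UTC so the sketch runs without fasttime.
    dst_breaks <- as.numeric(as.POSIXct(
      c("2010-03-14 07:00:00", "2010-11-07 06:00:00",
        "2011-03-13 07:00:00", "2011-11-06 06:00:00",
        "2012-03-11 07:00:00", "2012-11-04 06:00:00",
        "2013-03-10 07:00:00", "2013-11-03 06:00:00"),
      tz = "UTC"
    ))

    to_montreal_tz_vec <- function(x) {
      x <- as.numeric(x)
      # findInterval() counts the breaks that are <= x; an odd count means x lies
      # between a DST start and its matching end.
      is_dst <- findInterval(x, dst_breaks) %% 2 == 1
      # Same shifts as the loop version: 18000 s (5 h), one hour less during DST.
      ans <- x + 18000 - 3600 * is_dst
      class(ans) <- c("POSIXct", "POSIXt")
      ans
    }

    # Usage: one summer-time and one winter-time input, parsed as UTC
    x <- as.POSIXct(c("2010-03-15 12:37:17", "2010-01-15 12:37:17"), tz = "UTC")
    as.numeric(to_montreal_tz_vec(x)) - as.numeric(x)  # 14400 18000
    ```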
    

    Now, to compare times:

    #A million dates
    ch <- rep("2010-03-15 12:37:17.223", 1e6)
    #The easy way (no conversion of time zones afterwards)
    system.time(as.POSIXct(ch, tz="America/Montreal"))
    #   user  system elapsed 
    #  28.96    0.05   29.00 
    
    #A slight performance gain by specifying the format
    system.time(as.POSIXct(ch, format = "%Y-%m-%d %H:%M:%S", tz="America/Montreal"))
    #   user  system elapsed 
    #  13.77    0.01   13.79 
    
    #Using the fast functions
    library(fasttime)
    system.time(to_montreal_tz(fastPOSIXct(ch)))    
    #    user  system elapsed 
    #    0.51    0.02    0.53 
    

    As with all optimisation tricks, you've either got a 27-fold speedup (yay!) or you've saved 13 seconds of processing time but added 3 days of code-maintenance time from an obscure bug when your DST table runs out in 2035 (boo!).
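
    One way to make the "table runs out" failure loud instead of obscure is to check the input range before converting. The sketch below is an assumption about how such a guard might look; `assert_within_table` and its `first_year`/`last_year` arguments are hypothetical names, with defaults matching the 2010 to 2013 table above.

    ```r
    # Hypothetical guard: stop if any timestamp falls outside the years the
    # hand-written DST table covers.
    assert_within_table <- function(x, first_year = 2010, last_year = 2013) {
      lower <- as.numeric(as.POSIXct(sprintf("%d-01-01 00:00:00", first_year), tz = "UTC"))
      upper <- as.numeric(as.POSIXct(sprintf("%d-01-01 00:00:00", last_year + 1), tz = "UTC"))
      x <- as.numeric(x)
      # Missing values are ignored here; only real timestamps are range-checked
      outside <- !is.na(x) & (x < lower | x >= upper)
      if (any(outside)) {
        stop(sprintf("%d timestamp(s) fall outside the DST table (%d to %d)",
                     sum(outside), first_year, last_year))
      }
      invisible(TRUE)
    }

    # Usage: passes silently for 2010, raises an error for 2035
    assert_within_table(as.POSIXct("2010-03-15 12:37:17", tz = "UTC"))
    try(assert_within_table(as.POSIXct("2035-03-15 12:37:17", tz = "UTC")))
    ```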

  • 2021-02-11 02:48

    It's a daylight savings issue: http://www.timeanddate.com/time/dst/2010a.html

    In 2010 it began on the 14th March in Canada, but not until the 28th March in the UK.

    You can use POSIXlt objects to modify timezones directly:

    lt <- as.POSIXlt(as.POSIXct("2010-03-15 12:37:17.223",tz="GMT"))
    attr(lt,"tzone") <- "America/Montreal"
    as.POSIXct(lt)
    # [1] "2010-03-15 12:37:17 EDT"
    

    Or you could use format to convert to a string and set the timezone in a call to as.POSIXct. You can therefore modify forceTZ:

    forceTZ <- function(x,tz)
    {
      return(as.POSIXct(format(x),tz=tz))
    }
    
    
    forceTZ(as.POSIXct("2010-03-15 12:37:17.223",tz="GMT"),"America/Montreal")
    # [1] "2010-03-15 12:37:17 EDT"
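
    It may help to see why forceTZ goes through a string. Setting the tzone attribute on a POSIXct only changes how the same instant is printed, whereas formatting and reparsing keeps the clock reading and moves the instant. A small base R sketch of the difference (the variable names `shown` and `forced` are just for illustration):

    ```r
    x <- as.POSIXct("2010-03-15 12:37:17", tz = "GMT")

    # Relabel: same instant, different clock reading when printed
    shown <- x
    attr(shown, "tzone") <- "America/Montreal"

    # Reparse: same clock reading, different instant
    forced <- as.POSIXct(format(x), tz = "America/Montreal")

    as.numeric(shown)  - as.numeric(x)  # 0
    as.numeric(forced) - as.numeric(x)  # 14400
    format(shown)                       # "2010-03-15 08:37:17"
    format(forced)                      # "2010-03-15 12:37:17"
    ```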
    