How do I correctly convert timezones

Submitted by 时光总嘲笑我的痴心妄想 on 2020-01-01 03:42:06

Question


I am using the fasttime package for its fastPOSIXct function that can read character datetimes very efficiently. My problem is that it can only read character datetimes THAT ARE EXPRESSED IN GMT.

R) fastPOSIXct("2010-03-15 12:37:17.223",tz="GMT") #very fast
[1] "2010-03-15 12:37:17.223 GMT"
R) as.POSIXct("2010-03-15 12:37:17.223",tz="GMT") #very slow
[1] "2010-03-15 12:37:17.223 GMT"

Now, say I have a file with datetimes expressed in the "America/Montreal" timezone. The plan is to load them (implicitly pretending they are in GMT) and then modify the timezone attribute without changing the clock time that is displayed.

If I use this function, referred to in another post:

forceTZ = function(x,tz){   
    return(as.POSIXct(as.numeric(x), origin=as.POSIXct("1970-01-01",tz=tz), tz=tz))
}

I am seeing a bug ...

R) forceTZ(as.POSIXct("2010-03-15 12:37:17.223",tz="GMT"),"America/Montreal")
    [1] "2010-03-15 13:37:17.223 EDT"

... because I would like it to be

R) as.POSIXct("2010-03-15 12:37:17.223",format="%Y-%m-%d %H:%M:%OS",tz="America/Montreal")
    [1] "2010-03-15 12:37:17.223 EDT"

Is there a workaround?

EDIT: I know about lubridate::force_tz, but it is too slow (there would be no point in using fasttime::fastPOSIXct anymore).


Answer 1:


The smart thing to do here is almost certainly to write readable, easy-to-maintain code, and throw more hardware at the problem if your code is too slow.

If you are desperate for a code speedup, then you could write a custom time-zone adjustment function. It isn't pretty, so if you have to convert between many time zones, you'll end up with spaghetti code. Here's my solution for the specific case of converting from GMT to Montreal time.

First precompute a list of dates for daylight savings time. You'll need to extend this to before 2010/after 2013 in order to fit your dataset. I found the dates here

http://www.timeanddate.com/worldclock/timezone.html?n=165

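library(fasttime)  #provides fastPOSIXct, which parses the strings as GMT

#DST start/end instants for Montreal, given in GMT
montreal_tz_data <- cbind(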
  start = fastPOSIXct(
    c("2010-03-14 07:00:00", "2011-03-13 07:00:00", "2012-03-11 07:00:00", "2013-03-10 07:00:00")
  ),
  end   = fastPOSIXct(
    c("2010-11-07 06:00:00", "2011-11-06 06:00:00", "2012-11-04 06:00:00", "2013-11-03 06:00:00")
  )
)

For speed, the function to change time zones treats the times as numbers.

to_montreal_tz <- function(x)
{
  x <- as.numeric(x)
  is_dst <- logical(length(x))  #initialise as FALSE
  #Loop over DST periods in each year
  for(row in seq_len(nrow(montreal_tz_data)))
  {
    is_dst[x > montreal_tz_data[row, 1] & x < montreal_tz_data[row, 2]] <- TRUE
  }
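  #Hard-coded numbers are 4/5 hours in seconds (EDT is UTC-4, EST is UTC-5)
  ans <- ifelse(is_dst, x + 14400, x + 18000)
  class(ans) <- c("POSIXct", "POSIXt")
  attr(ans, "tzone") <- "America/Montreal"  #so the result prints in Montreal time
  ans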
}
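
One thing to watch with the table above: the input has been parsed as if it were GMT, so it is really a Montreal wall-clock time, while the DST boundaries are GMT instants. For timestamps within a few hours of a transition the two scales can disagree. A possible variant, offered only as a sketch, writes the boundaries as wall-clock times (02:00 local on each changeover day) and replaces the loop with findInterval. The names dst_bounds and to_montreal_tz2 are made up for illustration, only 2010 and 2011 are filled in, the ambiguous hour at the end of DST is treated as daylight time, and base as.POSIXct with tz = "UTC" stands in for fastPOSIXct so the snippet runs without fasttime.

```r
# Hypothetical variant of to_montreal_tz(); names are illustrative only.
# DST boundaries are written as Montreal wall-clock times and parsed as UTC,
# which puts them on the same scale as input that was parsed "as if GMT".
# The vector must be sorted and alternate start, end, start, end, ...
dst_bounds <- as.numeric(as.POSIXct(
  c("2010-03-14 02:00:00", "2010-11-07 02:00:00",
    "2011-03-13 02:00:00", "2011-11-06 02:00:00"),
  tz = "UTC"
))

to_montreal_tz2 <- function(x) {
  x <- as.numeric(x)
  # findInterval() counts how many boundaries are <= x; an odd count means
  # x lies between a DST start and the matching DST end.
  is_dst <- findInterval(x, dst_bounds) %% 2 == 1
  # Shift by 4 hours (EDT) or 5 hours (EST), expressed in seconds
  ans <- x + ifelse(is_dst, 14400, 18000)
  structure(ans, class = c("POSIXct", "POSIXt"), tzone = "America/Montreal")
}

# Stand-in for fastPOSIXct(): parse the wall-clock string as UTC
wall <- as.POSIXct("2010-03-15 12:37:17", tz = "UTC")
to_montreal_tz2(wall)
```

Dates outside the table fall through to standard time, so the table still has to cover the whole range of the data.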

Now, to compare times:

#A million dates
ch <- rep("2010-03-15 12:37:17.223", 1e6)
#The easy way (no conversion of time zones afterwards)
system.time(as.POSIXct(ch, tz="America/Montreal"))
#   user  system elapsed 
#  28.96    0.05   29.00 

#A slight performance gain by specifying the format
system.time(as.POSIXct(ch, format = "%Y-%m-%d %H:%M:%S", tz="America/Montreal"))
#   user  system elapsed 
#  13.77    0.01   13.79 

#Using the fast functions
library(fasttime)
system.time(to_montreal_tz(fastPOSIXct(ch)))    
#    user  system elapsed 
#    0.51    0.02    0.53 

As with all optimisation tricks, you've either got a 27-fold speedup (yay!) or you've saved 13 seconds of processing time but added 3 days of code-maintenance time from an obscure bug when your DST table runs out in 2035 (boo!).




Answer 2:


It's a daylight savings issue: http://www.timeanddate.com/time/dst/2010a.html

In 2010 it began on the 14th March in Canada, but not until the 28th March in the UK.
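
To see where the extra hour comes from, here is a minimal sketch that measures the offset from UTC on the two dates involved. The helper utc_offset is a hypothetical name, not part of any package. The forceTZ from the question shifts by the offset at its 1970-01-01 origin, which falls in standard time, whereas the timestamp itself falls in daylight time.

```r
# Hypothetical helper: seconds that separate a wall-clock time read in `tz`
# from the same wall-clock time read in UTC.
utc_offset <- function(wall_clock, tz) {
  as.numeric(as.POSIXct(wall_clock, tz = tz)) -
    as.numeric(as.POSIXct(wall_clock, tz = "UTC"))
}

utc_offset("1970-01-01 00:00:00", "America/Montreal")  # 18000 (EST, 5 hours)
utc_offset("2010-03-15 12:37:17", "America/Montreal")  # 14400 (EDT, 4 hours)
```

The 3600-second difference between the two offsets is the one-hour error seen in the question.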

You can use POSIXlt objects to modify timezones directly:

lt <- as.POSIXlt(as.POSIXct("2010-03-15 12:37:17.223",tz="GMT"))
attr(lt,"tzone") <- "America/Montreal"
as.POSIXct(lt)
[1] "2010-03-15 12:37:17 EDT"

Or you could use format to convert to a string and set the timezone in a call to as.POSIXct. You can therefore modify forceTZ:

forceTZ <- function(x,tz)
{
  return(as.POSIXct(format(x),tz=tz))
}


forceTZ(as.POSIXct("2010-03-15 12:37:17.223",tz="GMT"),"America/Montreal")
[1] "2010-03-15 12:37:17 EDT"
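
Note that format(x) with default options writes whole seconds only, so the round trip through a string can drop the milliseconds. If they matter, one possible variant (a sketch; forceTZ_ms is a hypothetical name) is to format with "%OS3", which writes three decimal places that as.POSIXct can read back. The last digit may differ by one because of floating-point truncation.

```r
# Hypothetical variant of forceTZ() that keeps milliseconds.
forceTZ_ms <- function(x, tz) {
  # "%OS3" writes the seconds with three decimals; as.POSIXct() parses the
  # resulting string as a wall-clock time in the requested time zone.
  as.POSIXct(format(x, "%Y-%m-%d %H:%M:%OS3"), tz = tz)
}

x <- as.POSIXct("2010-03-15 12:37:17.223", tz = "GMT")
y <- forceTZ_ms(x, "America/Montreal")
format(y, "%Y-%m-%d %H:%M:%OS3 %Z")
```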



Answer 3:


Could you not just add the appropriate number of seconds to correct the offset from GMT?

# Original problem
fastPOSIXct("2010-03-15 12:37:17.223",tz="America/Montreal")
# [1] "2010-03-15 08:37:17 EDT"

# Add 4 hours' worth of seconds to the data. This should be very quick.
# (4 hours is the offset while DST is in effect; outside DST it is 5 hours, i.e. 18000.)
fastPOSIXct("2010-03-15 12:37:17.223",tz="America/Montreal") + 14400
# [1] "2010-03-15 12:37:17 EDT"


Source: https://stackoverflow.com/questions/15816534/how-do-i-convert-correctly-timezones
