Converting to Local Time in R - Vector of Timezones

霸气de小男生 posted on 2019-12-30 10:06:55

Question


I have a set of data from across the US that I am trying to convert into local time for each "subject". I have UTC timestamps on each event and have converted those into POSIXct format, but every time I try to include a vector of tz = DS$Factor or tz = as.character(DS$Factor) in any of the POSIXct/POSIXlt functions (including format() and strftime()) I get an error that says:

Error in as.POSIXlt.POSIXct(x, tz = tz) : invalid 'tz' value

If I just enter tz = 'US/Eastern' it works fine, but of course not all of my values are from that time zone.

How do I get the time stamps into local time for each "subject"?

The DS$Factor has 5 values: US/Arizona US/Central US/Eastern US/Mountain US/Pacific

Thanks, Shorthand


Answer 1:


Bringing in dplyr and lubridate, I wound up doing something like:

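library(lubridate)
library(dplyr)

# Example data: timestamps recorded in UTC plus each row's local zone
df <- data.frame(timestring = c("2015-12-12 13:34:56", "2015-12-14 16:23:32"),
                 localzone = c("America/Los_Angeles", "America/New_York"),
                 stringsAsFactors = FALSE)

# Parse the strings as UTC instants
df$moment <- as.POSIXct(df$timestring, format = "%Y-%m-%d %H:%M:%S", tz = "UTC")

# Work row by row so that each call receives a single time zone string.
# Note: force_tz() keeps the clock reading and swaps the zone;
# with_tz() keeps the instant and changes the zone it is displayed in.
df <- df %>% rowwise() %>% mutate(localtime = force_tz(moment, localzone))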

df
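
Since the answer above relies on force_tz() while the question asks for UTC instants shown in local time, it may help to see how the two lubridate functions differ. The following is only a minimal sketch on a single timestamp taken from the example data; the variable names `shown` and `forced` are made up for illustration.

```r
library(lubridate)

# One UTC instant from the example data above
moment <- as.POSIXct("2015-12-12 13:34:56", tz = "UTC")

# with_tz(): same instant, displayed on the Los Angeles clock
shown <- with_tz(moment, "America/Los_Angeles")

# force_tz(): same clock reading, reinterpreted as Los Angeles time,
# which makes it a different instant
forced <- force_tz(moment, "America/Los_Angeles")

format(shown,  "%Y-%m-%d %H:%M:%S")   # "2015-12-12 05:34:56"
format(forced, "%Y-%m-%d %H:%M:%S")   # "2015-12-12 13:34:56"

# The forced value lies 8 hours after the original instant
as.numeric(difftime(forced, moment, units = "hours"))   # 8
```

If the goal is "what did the local clock say when this UTC event happened", with_tz() is probably the function to reach for.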



Answer 2:


Actually, what I did was loop through the time zones instead of the rows of the data set, which is much, much faster. I'll post code tomorrow.

In general, that's a lesson for R: don't loop through the big data frame, loop through the (much shorter) vector of categories and apply using the which() function.

As there are only 5 time zones, the loop only takes a few seconds now.

One other caveat is that if you put it into POSIXct format, it will still graph the times in your machine's local time zone. So you need an extra step to then convert it into local time using force_tz().

cap$tdiff is really just created to make sure that the code is doing what it says it should be doing.

library("lubridate")    

tzs <- as.character(unique(cap$timezone))

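# Placeholder POSIXlt column, filled in one time zone at a time below
cap$localtimes <- as.POSIXlt(0, origin = "1970-01-01")

# Loop over the distinct time zones instead of over the rows of cap
for (i in seq_along(tzs)) {
  whichrows <- which(cap$timezone == tzs[i])

  cap[whichrows, "localtimes"] <-
    with_tz(cap[whichrows, "UTC"], tzone = tzs[i])
}

remove(i, whichrows)

# Sanity check: difference between the local clock reading and UTC
# (refers to the localtimes column by its full name)
cap$tdiff <- as.numeric(force_tz(cap$localtimes, "UTC") - cap$UTC)
cap$localtime <- as.POSIXct(force_tz(cap$localtimes))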
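
The "loop over the categories, not the rows" idea can also be sketched in base R alone. This is an illustrative variant, not the code above: it stores each local clock reading as a character string via format(), and the data frame `toy`, its values, and the column `localclock` are hypothetical. The IANA names America/New_York and America/Phoenix are used here in place of US/Eastern and US/Arizona.

```r
# Hypothetical toy data standing in for the `cap` data frame
toy <- data.frame(
  UTC = as.POSIXct(c("2015-07-01 12:00:00",
                     "2015-07-01 12:00:00",
                     "2015-07-01 18:30:00"), tz = "UTC"),
  timezone = c("America/New_York", "America/Phoenix", "America/New_York"),
  stringsAsFactors = FALSE
)

toy$localclock <- NA_character_

# One pass per distinct zone; format() accepts a single tz string,
# so every row sharing that zone is converted in one vectorised call
for (tz in unique(toy$timezone)) {
  rows <- which(toy$timezone == tz)
  toy$localclock[rows] <- format(toy$UTC[rows], "%Y-%m-%d %H:%M:%S", tz = tz)
}

toy$localclock
# "2015-07-01 08:00:00" "2015-07-01 05:00:00" "2015-07-01 14:30:00"
```

Keeping the result as text sidesteps the fact that a POSIXct column carries one time zone for all of its rows, at the cost of no longer being a date-time type.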



Answer 3:


So I was able to create a for loop to do this, but it is slow, taking about 10 minutes to run. I couldn't figure out an apply() syntax, and would certainly appreciate some help creating a faster, more parallelizable way of doing this operation, as the datastore has 768k observations and is growing.

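library(lubridate)

loct <- NULL
for (i in seq_len(nrow(DS)))
{
  # Rows with an empty time zone fall back to US/Eastern
  loct[i] <- with_tz(DS$UTC[i], tzone =
    ifelse(DS$timezone[i] == "", "US/Eastern", as.character(DS$timezone[i])))
}
# loct holds plain numeric seconds since the epoch, hence the origin argument
DS$localtime <- as.POSIXct(loct, origin = "1970-01-01")
remove(loct, i)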
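
For the apply()-style version asked about here, one possible sketch is to split the timestamps by zone, format each group once, and put the pieces back in the original row order. It uses only base R (split(), Map(), unsplit()). The data frame `DS_demo`, the column `localclock`, and `default_zone` are hypothetical names, and the local times come back as character strings rather than POSIXct. America/New_York stands in for US/Eastern as the fallback.

```r
# Hypothetical toy data standing in for DS; the blank zone exercises the fallback
DS_demo <- data.frame(
  UTC = as.POSIXct(c("2015-01-15 12:00:00",
                     "2015-01-15 12:00:00",
                     "2015-01-15 23:45:00"), tz = "UTC"),
  timezone = c("America/Los_Angeles", "", "America/Chicago"),
  stringsAsFactors = FALSE
)

# Vectorised fallback for missing zones (no per-row ifelse needed)
default_zone <- "America/New_York"
zones <- ifelse(DS_demo$timezone == "", default_zone, DS_demo$timezone)

# Split by zone, format each group with its own tz, then restore row order
parts <- split(DS_demo$UTC, zones)
local_parts <- Map(function(x, tz) format(x, "%Y-%m-%d %H:%M:%S", tz = tz),
                   parts, names(parts))
DS_demo$localclock <- unsplit(local_parts, zones)

DS_demo$localclock
# "2015-01-15 04:00:00" "2015-01-15 07:00:00" "2015-01-15 17:45:00"
```

The work done is proportional to the number of distinct zones rather than the number of rows, which is the same reason the by-zone loop in Answer 2 runs so much faster.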


Source: https://stackoverflow.com/questions/32084042/converting-to-local-time-in-r-vector-of-timezones
