Question
I have a set of data from across the US that I am trying to convert into local time for each "subject". I have UTC timestamps on each event and have converted those into POSIXct format, but every time I try to include a vector of time zones, tz = DS$Factor or tz = as.character(DS$Factor), in any of the POSIXct/POSIXlt functions (including format() and strftime()), I get an error that says:
Error in as.POSIXlt.POSIXct(x, tz = tz) : invalid 'tz' value
If I just enter tz = 'US/Eastern' it works fine, but of course not all of my values are from that time zone.
How do I get the time stamps into local time for each "subject"?
DS$Factor has 5 values: US/Arizona, US/Central, US/Eastern, US/Mountain, US/Pacific.
Thanks, Shorthand
Answer 1:
Bringing in dplyr and lubridate, I wound up doing something like:
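library(lubridate)
library(dplyr)
# Example data: UTC timestamps as strings, plus each row's local time zone
df <- data.frame(timestring = c("2015-12-12 13:34:56", "2015-12-14 16:23:32"),
                 localzone = c("America/Los_Angeles", "America/New_York"),
                 stringsAsFactors = FALSE)
# Parse the strings as UTC instants
df$moment <- as.POSIXct(df$timestring, format = "%Y-%m-%d %H:%M:%S", tz = "UTC")
# Apply each row's own zone. Note: force_tz() keeps the clock reading and changes
# the instant; with_tz() keeps the instant and changes the clock reading.
df <- df %>% rowwise() %>% mutate(localtime = force_tz(moment, localzone))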
df
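To make the difference between the two lubridate helpers concrete, here is a minimal sketch on a single timestamp. It assumes the goal is the one stated in the question (a UTC instant shown as local clock time); the variable names are only illustrative.

```r
library(lubridate)

# One UTC instant, taken from the example data above
utc <- as.POSIXct("2015-12-12 13:34:56", tz = "UTC")

# with_tz(): same instant, shown on the Los Angeles clock (UTC-8 in December)
converted <- with_tz(utc, tzone = "America/Los_Angeles")

# force_tz(): same clock reading, relabelled as Los Angeles time,
# which is a different instant (8 hours later than the original)
relabelled <- force_tz(utc, tzone = "America/Los_Angeles")

format(converted, "%H:%M:%S")   # "05:34:56"
format(relabelled, "%H:%M:%S")  # "13:34:56"
as.numeric(difftime(relabelled, utc, units = "hours"))  # 8
```

If the timestamps really are UTC, with_tz() is probably the call that matches the question; force_tz() fits the case where the strings were recorded in local time to begin with.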
Answer 2:
Actually, what I did was to loop through the time zones instead of the rows of the data set; then it's much, much faster. I'll post code tomorrow.
In general, that's a lesson for R: don't loop through the big data frame; loop through the (much shorter) vector of categories and pick out the matching rows using the which() function.
As there are only 5 time zones, the loop only takes a few seconds now.
One other caveat is that if you put it into POSIXct format it will still graph the times in your machine's local time zone. So you need an extra step to then convert it into local time using force_tz().
cap$tdiff is really just created to make sure that the code is doing what it says it should be doing.
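library(lubridate)
tzs <- as.character(unique(cap$timezone))
# Start from the UTC instants; this column will hold each row's local clock time
cap$localtime <- with_tz(cap$UTC, tzone = "UTC")
# Now loop through by time zone instead of by rows of cap
for (tz in tzs) {
  whichrows <- which(cap$timezone == tz)
  # with_tz() moves the clock reading to the local zone; force_tz() then relabels
  # that reading as UTC, because one POSIXct column can carry only one time zone
  cap$localtime[whichrows] <- force_tz(with_tz(cap$UTC[whichrows], tzone = tz), tzone = "UTC")
}
rm(tz, whichrows)
# Sanity check: offset of the local clock time from UTC, in hours
cap$tdiff <- as.numeric(difftime(cap$localtime, cap$UTC, units = "hours"))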
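The caveat about POSIXct can be seen in a small base R sketch. It is only an illustration with a made-up timestamp: the time zone of a POSIXct vector is a single display attribute, so two zones cannot live in one column.

```r
# Base R only: the same instant displayed in two different zones
x <- as.POSIXct("2015-12-12 13:34:56", tz = "UTC")
pacific <- x; attr(pacific, "tzone") <- "US/Pacific"
eastern <- x; attr(eastern, "tzone") <- "US/Eastern"

format(pacific, "%H:%M")  # "05:34"
format(eastern, "%H:%M")  # "08:34"

# The stored values are identical; only the display attribute differs
identical(as.numeric(pacific), as.numeric(eastern))  # TRUE

# Combining them yields one vector, which cannot keep both zones
combined <- c(pacific, eastern)
attr(combined, "tzone")  # NULL: the conflicting zones are dropped
```

This is why the local clock time has to be either relabelled into one common zone (the force_tz() step) or stored as text.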
Answer 3:
So I was able to create a for loop to do this, but it is slow, taking about 10 minutes to run. I couldn't figure out an apply() syntax, and would certainly appreciate some help creating a faster, more parallelizable way of doing this operation, as the data store has 768k observations and is growing.
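library(lubridate)

# Row-by-row loop (slow). Assigning into a plain vector drops the time zone,
# so loct ends up holding seconds since 1970-01-01.
loct <- numeric(nrow(DS))
for (i in seq_len(nrow(DS))) {
  # Rows with an empty time zone fall back to US/Eastern
  tz <- ifelse(DS$timezone[i] == "", "US/Eastern", as.character(DS$timezone[i]))
  loct[i] <- with_tz(DS$UTC[i], tzone = tz)
}
DS$localtime <- as.POSIXct(loct, origin = "1970-01-01")
rm(loct, i, tz)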
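One possible faster variant, sketched here in base R, is to call format() once per time zone rather than once per row, since format() accepts only a single tz value per call. The function name to_local_strings and its default_zone argument are hypothetical, the toy DS data frame is made up for illustration, and the result is a character column rather than POSIXct.

```r
# Format UTC instants as local clock-time strings, one format() call per zone.
# utc:   POSIXct vector of instants
# zones: character (or factor) vector of time zone names, one per element of utc
to_local_strings <- function(utc, zones, default_zone = "US/Eastern") {
  zones <- as.character(zones)
  # Missing or empty zones fall back to the default, as in the loop above
  zones[is.na(zones) | zones == ""] <- default_zone
  out <- character(length(utc))
  for (z in unique(zones)) {
    idx <- zones == z
    out[idx] <- format(utc[idx], format = "%Y-%m-%d %H:%M:%S", tz = z)
  }
  out
}

# Toy data standing in for the real DS
DS <- data.frame(
  UTC = as.POSIXct(c("2015-07-01 12:00:00", "2015-07-01 12:00:00",
                     "2015-12-12 13:34:56"), tz = "UTC"),
  timezone = c("US/Eastern", "US/Arizona", ""),
  stringsAsFactors = FALSE
)
DS$localtime <- to_local_strings(DS$UTC, DS$timezone)
DS$localtime
# "2015-07-01 08:00:00" "2015-07-01 05:00:00" "2015-12-12 08:34:56"
```

With only five distinct zones, the loop body runs five times regardless of how many rows the data has.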
Source: https://stackoverflow.com/questions/32084042/converting-to-local-time-in-r-vector-of-timezones