Combine date as integer and time as factor to POSIXct in R

问题

I know this has been asked several times and I looked at the questions and followed the suggestions. However, I couldn't solve this one.

The datetime.csv can be found on https://www.dropbox.com/s/6bvhk4kei4pg8zq/datetime.csv

My code looks like:

jd1 <- read.csv("datetime.csv")
head(jd1)
      Date Time
1 20100101 0:00
2 20100101 1:00
3 20100101 2:00
4 20100101 3:00
5 20100101 4:00
6 20100101 5:00

sapply(jd1,class)
> sapply(jd1,class)
     Date      Time 
"integer"  "factor"

jd1 <- transform(jd1, timestamp=format(as.POSIXct(paste(Date, Time)), "%Y%m%d %H:%M:%S"))
Error in as.POSIXlt.character(x, tz, ...) : 
character string is not in a standard unambiguous format

I tried the solution suggested by rcs on Converting two columns of date and time data to one but this seems to give an error.

Any help is highly appreciated.

Thanks.

回答1:

The format string you're passing to format includes %S which you don't have. But that won't fix the error since its coming from as.POSIXct. You need to pass the format string there instead and remove the call to the format function.

foo <- transform(jd1, timestamp=as.POSIXct(paste(Date, Time), format="%Y%m%d %H:%M"))
str(foo)

Compare this to:

bar <- transform(jd1, timestamp=as.POSIXct(paste(Date, Time), format="%Y%m%d %H:%M:%S"))
str(bar)

And the result of calling format:

baz <- transform(jd1, timestamp=format(as.POSIXct(paste(Date, Time), format="%Y%m%d %H:%M"), format='%Y%m%d %H:%M:%S'))
str(baz)

回答2:

if it's just this file you don't even need to read it as csv. Following will do

# if you are reading just timestamps, you may want to read it as just one column
jd1 <- read.table("datetime.csv", header = TRUE, colClasses = c("character"))
jd1$timestamp <- as.POSIXct(jd1$Date.Time, format = "%Y%m%d,%H:%M")
head(jd1)
##       Date.Time           timestamp
## 1 20100101,0:00 2010-01-01 00:00:00
## 2 20100101,1:00 2010-01-01 01:00:00
## 3 20100101,2:00 2010-01-01 02:00:00
## 4 20100101,3:00 2010-01-01 03:00:00
## 5 20100101,4:00 2010-01-01 04:00:00
## 6 20100101,5:00 2010-01-01 05:00:00



# if you must read it as seperate columns as you may have other columns in your file
jd2 <- read.csv("datetime.csv", header = TRUE, colClasses = c("character", "character"))
jd2$timestamp <- as.POSIXct(paste(jd2$Date, jd2$Time, sep = " "), format = "%Y%m%d %H:%M")
head(jd2)
##       Date Time           timestamp
## 1 20100101 0:00 2010-01-01 00:00:00
## 2 20100101 1:00 2010-01-01 01:00:00
## 3 20100101 2:00 2010-01-01 02:00:00
## 4 20100101 3:00 2010-01-01 03:00:00
## 5 20100101 4:00 2010-01-01 04:00:00
## 6 20100101 5:00 2010-01-01 05:00:00

Arun's comment prompted me to do some benchmarking..

jd2 <- read.csv("datetime.csv", header = TRUE, colClasses = c("character", "character"))
library(microbenchmark)
microbenchmark(as.POSIXct(paste(jd2$Date, jd2$Time, sep = " "), format = "%Y%m%d %H:%M"), as.POSIXct(do.call(paste, c(jd2[c("Date", "Time")])), format = "%Y%m%d %H:%M"), 
    transform(jd2, timestamp = as.POSIXct(paste(Date, Time), format = "%Y%m%d %H:%M")), times = 100)
## Unit: milliseconds
##                                                                                expr      min       lq   median       uq      max neval
##           as.POSIXct(paste(jd2$Date, jd2$Time, sep = " "), format = "%Y%m%d %H:%M") 18.84720 18.87736 18.89542 18.93307 20.99021   100
##      as.POSIXct(do.call(paste, c(jd2[c("Date", "Time")])), format = "%Y%m%d %H:%M") 18.94440 18.97917 18.99492 19.02220 21.07320   100
##  transform(jd2, timestamp = as.POSIXct(paste(Date, Time), format = "%Y%m%d %H:%M")) 19.05581 19.10230 19.12612 19.16877 21.27490   100

来源：https://stackoverflow.com/questions/15367210/combine-date-as-integer-and-time-as-factor-to-posixct-in-r

标签

datetime

posixct