I have a vector of POSIXct values and I would like to round them to the nearest quarter hour. I don't care about the day. How do I convert the values to hours and minutes?
Indeed, an old question with some helpful answers so far. The last one by giraffhere seems to be the most elegant. However, it is not floor_date but round_date that does the trick:
lubridate::round_date(x, "15 minutes")
You can use round. The trick is to divide by 900 seconds (15 minutes * 60 seconds) before rounding and multiply by 900 afterwards:
a <-as.POSIXlt("2012-05-30 20:41:21 UTC")
b <-as.POSIXlt(round(as.double(a)/(15*60))*(15*60),origin=(as.POSIXlt('1970-01-01')))
b
[1] "2012-05-30 20:45:00 EDT"
To get only hour and minute, just use format
format(b,"%H:%M")
[1] "20:45"
as.character(format(b,"%H:%M"))
[1] "20:45"
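Since the question asks about a whole vector, the same divide-round-multiply trick can be applied to a POSIXct vector in one go. This is a minimal sketch, assuming the timestamps are in UTC; round_quarter is a hypothetical helper name, not something from the answer above.

```r
# Hypothetical helper (not part of base R): round to the nearest 900 seconds.
round_quarter <- function(x, tz = "UTC") {
  secs <- round(as.numeric(x) / 900) * 900   # epoch seconds snapped to 15 min
  as.POSIXct(secs, origin = "1970-01-01", tz = tz)
}

x <- as.POSIXct(c("2012-05-30 20:41:21",
                  "2012-05-30 20:55:21",
                  "2012-05-30 23:53:00"), tz = "UTC")
r <- round_quarter(x)
format(r, "%H:%M")
# [1] "20:45" "21:00" "00:00"
```

Note that the last value rolls over to the next day, which does not matter if only the hours and minutes are kept. Base R's round() resolves exact halves to the even neighbour, so a timestamp exactly 7.5 minutes past a quarter may go either way.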
Try this, which combines both requests and is based on looking at what round.POSIXt() and trunc.POSIXt() do.
myRound <- function (x, convert = TRUE) {
    x <- as.POSIXlt(x)
    mins <- x$min
    mult <- mins %/% 15
    remain <- mins %% 15
    if(remain > 7L || (remain == 7L && x$sec > 29))
        mult <- mult + 1
    if(mult > 3) {
        x$min <- 0
        x <- x + 3600
    } else {
        x$min <- 15 * mult
    }
    x <- trunc.POSIXt(x, units = "mins")
    if(convert) {
        x <- format(x, format = "%H:%M")
    }
    x
}
This gives:
> tmp <- as.POSIXct("2012-05-30 20:41:21 UTC")
> myRound(tmp)
[1] "20:45"
> myRound(tmp, convert = FALSE)
[1] "2012-05-30 20:45:00 BST"
> tmp2 <- as.POSIXct("2012-05-30 20:55:21 UTC")
> myRound(tmp2)
[1] "21:00"
> myRound(tmp2, convert = FALSE)
[1] "2012-05-30 21:00:00 BST"
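myRound uses if() on its input, so it handles one timestamp at a time. A vectorized variant could look like the sketch below. myRoundVec is a hypothetical name, and the sketch relies on as.POSIXct() normalizing an out-of-range minute field (60 becomes the next hour). It rounds the wall-clock time in the time zone of the input and is only checked here for UTC.

```r
# Hypothetical vectorized variant of myRound (an illustration, not the original).
myRoundVec <- function(x, convert = TRUE) {
  lt <- as.POSIXlt(x)
  secs_into_hour <- lt$min * 60 + lt$sec
  # Nearest quarter, halves rounded up; the result may be 60,
  # which as.POSIXct() carries over into the next hour.
  lt$min <- 15L * as.integer(floor(secs_into_hour / 900 + 0.5))
  lt$sec <- rep(0, length(lt$min))
  out <- as.POSIXct(lt)
  if (convert) format(out, "%H:%M") else out
}

tmp <- as.POSIXct(c("2012-05-30 20:41:21", "2012-05-30 20:55:21"), tz = "UTC")
myRoundVec(tmp)
# [1] "20:45" "21:00"
```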
Using the IDate and ITime classes from data.table and an IPeriod class (just developed), I was able to get a more scalable solution.
Only shhhhimhuntingrabbits and PLapointe answer the question in terms of nearest. The xts solution only rounds up (ceiling), while my IPeriod solution lets you specify ceiling or floor.
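The difference between nearest, floor, and ceiling can be illustrated in plain base R, without xts or IPeriod. This is only a sketch under the assumption of UTC timestamps; snap_quarter and its how argument are hypothetical names made up for the illustration.

```r
# Hypothetical helper: snap to a 15-minute grid using round, floor or ceiling.
snap_quarter <- function(x, how = c("round", "floor", "ceiling"), tz = "UTC") {
  how <- match.arg(how)
  f <- switch(how, round = round, floor = floor, ceiling = ceiling)
  as.POSIXct(f(as.numeric(x) / 900) * 900, origin = "1970-01-01", tz = tz)
}

p <- as.POSIXct("2012-05-30 20:32:00", tz = "UTC")
format(snap_quarter(p, "round"), "%H:%M")    # "20:30"
format(snap_quarter(p, "floor"), "%H:%M")    # "20:30"
format(snap_quarter(p, "ceiling"), "%H:%M")  # "20:45"
```

For 20:32:00, nearest and floor agree on 20:30, while ceiling gives 20:45, so comparing a ceiling-based result with a nearest-based one is not an equality check.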
To get top performance you would need to keep your data in the IDate and ITime classes. As the benchmark shows, it is cheap to produce POSIXct from IDate/ITime/IPeriod. Below is a benchmark on some 22M timestamps:
# install only if you don't have
install.packages(c("microbenchmarkCore","data.table"),
repos = c("https://olafmersmann.github.io/drat",
"https://jangorecki.github.io/drat/iperiod"))
library(microbenchmarkCore)
library(data.table) # iunit branch
library(xts)
Sys.setenv(TZ="UTC")
## some source data: download and unzip csv
# "http://api.bitcoincharts.com/v1/csv/btceUSD.csv.gz"
# below benchmark on btceUSD.csv.gz 11-Oct-2015 11:35 133664801
system.nanotime(dt <- fread(".btceUSD.csv"))
# Read 21931266 rows and 3 (of 3) columns from 0.878 GB file in 00:00:10
# user system elapsed
# NA NA 9.048991
# take the timestamp only
x = as.POSIXct(dt[[1L]], tz="UTC", origin="1970-01-01")
# functions
shhhhi <- function(your.time){
strptime("1970-01-01", "%Y-%m-%d", tz="UTC") + round(as.numeric(your.time)/900)*900
}
PLapointe <- function(a){
as.POSIXlt(round(as.double(a)/(15*60))*(15*60),origin=(as.POSIXlt('1970-01-01')))
}
# myRound - not vectorized
# compare results
all.equal(
format(shhhhi(x),"%H:%M"),
format(PLapointe(x),"%H:%M")
)
# [1] TRUE
all.equal(
format(align.time(x, n = 60*15),"%H:%M"),
format(periodize(x, "mins", 15),"%H:%M")
)
# [1] TRUE
# IPeriod native input are IDate and ITime - will be tested too
idt <- IDateTime(x)
idate <- idt$idate
itime <- idt$itime
microbenchmark(times = 10L,
shhhhi(x),
PLapointe(x),
xts = align.time(x, 15*60),
posix_ip_posix = as.POSIXct(periodize(x, "mins", 15), tz="UTC"),
posix_ip = periodize(x, "mins", 15),
ip_posix = as.POSIXct(periodize(idate, itime, "mins", 15), tz="UTC"),
ip = periodize(idate, itime, "mins", 15))
# Unit: microseconds
# expr min lq mean median uq max neval
# shhhhi(x) 960819.810 984970.363 1127272.6812 1167512.2765 1201770.895 1243706.235 10
# PLapointe(x) 2322929.313 2440263.122 2617210.4264 2597772.9825 2792936.774 2981499.356 10
# xts 453409.222 525738.163 581139.6768 546300.9395 677077.650 767609.155 10
# posix_ip_posix 3314609.993 3499220.920 3641219.0876 3586822.9150 3654548.885 4457614.174 10
# posix_ip 3010316.462 3066736.299 3157777.2361 3133693.0655 3234307.549 3401388.800 10
# ip_posix 335.741 380.696 513.7420 543.3425 630.020 663.385 10
# ip 98.031 151.471 207.7404 231.8200 262.037 278.789 10
IDate and ITime scale well, and not only for this particular task. Both types, like IPeriod, are integer based. I would assume they will also scale nicely on joins or grouping by datetime fields.
Online manual: https://jangorecki.github.io/drat/iperiod/
Something like
format(strptime("1970-01-01", "%Y-%m-%d", tz="UTC") + round(as.numeric(your.time)/900)*900,"%H:%M")
would work.
You can use the align.time function in the xts package to handle the rounding, then format to return a string of "HH:MM":
R> library(xts)
R> p <- as.POSIXct("2012-05-30 20:41:21", tz="UTC")
R> a <- align.time(p, n=60*15) # n is in seconds
R> format(a, "%H:%M")
[1] "20:45"