I have date ranges that are grouped by two variables (id and type) that are currently stored in a data frame called data. My goal is to ex
1) by Here is a three-line answer using by from base R. First we convert the dates to "Date" class, giving data2. Then we apply f, which does the real work, to each row, and finally we rbind the resulting rows together:
data2 <- transform(data, from = as.Date(from), to = as.Date(to))
f <- function(x) with(x, data.frame(id, type, date = seq(from, to, by = "day")))
do.call("rbind", by(data2, 1:nrow(data2), f))
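To make the steps above easy to try, here is a minimal self-contained sketch. The original data is not shown, so the two-row data frame below is purely hypothetical, with from and to stored as character strings; adjust it to match your own columns.

```r
# Hypothetical sample data (an assumption, not the asker's data).
data <- data.frame(
  id   = c(1, 1),
  type = c("a", "b"),
  from = c("2009-02-21", "2009-02-25"),
  to   = c("2009-02-23", "2009-02-26"),
  stringsAsFactors = FALSE
)

# Step 1: convert the range endpoints to Date class.
data2 <- transform(data, from = as.Date(from), to = as.Date(to))

# Step 2: expand a one-row data frame into one row per day of its range.
# id and type have length 1 and are recycled to the length of the sequence.
f <- function(x) with(x, data.frame(id, type, date = seq(from, to, by = "day")))

# Step 3: apply f to each row and stack the pieces.
res <- do.call("rbind", by(data2, 1:nrow(data2), f))

# rbind builds row names such as "1.1"; reset them for readability.
rownames(res) <- NULL
res
```

With this sample input the result should have five rows (three days for type a, two for type b), and date keeps its "Date" class.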
2) data.table Using the same data2, with data.table we do it like this:
library(data.table)
dt <- data.table(data2)
dt[, list(id, type, date = seq(from, to, by = "day")), by = 1:nrow(dt)]
2a) data.table Alternatively, use this, where dt is from (2) and f is from (1):
dt[, f(.SD), by = 1:nrow(dt)]
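Grouping by row number adds a helper grouping column to the result. The sketch below is one way to keep that under control: it names the helper column explicitly and drops it afterwards. The sample data and the column name row are assumptions for illustration, and the data.table package must be installed.

```r
library(data.table)

# Hypothetical sample data (an assumption, not the asker's data).
dt <- data.table(
  id   = c(1, 1),
  type = c("a", "b"),
  from = as.Date(c("2009-02-21", "2009-02-25")),
  to   = as.Date(c("2009-02-23", "2009-02-26"))
)

# Group by an explicitly named row index so each group is exactly one row,
# then build the daily sequence for that row's range.
res <- dt[, .(id, type, date = seq(from, to, by = "day")),
          by = .(row = seq_len(nrow(dt)))]

# Drop the helper grouping column once the expansion is done.
res[, row := NULL]
res
```

Naming the grouping column is only a convenience; the one-line versions above do the same expansion.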
3) dplyr With dplyr this gives a warning but otherwise works, where data2 and f are from (1):
data2 %>% rowwise() %>% do(f(.))
UPDATES Some improvements.
Here is one way to perform such a transformation using base functions:
do.call(rbind, Map(function(id, type, from, to) {
  dts <- seq(from = from, to = to, by = "1 day")
  dur <- length(dts)
  data.frame(
    id = rep(id, dur),
    type = rep(type, dur),
    date = dts
  )
}, data$id, data$type, data$from, data$to))
And the first chunk of the output is:
   id type                date
1   1    a 2009-02-21 02:00:00
2   1    a 2009-02-22 02:00:00
3   1    a 2009-02-23 02:00:00
4   1    a 2009-02-25 02:00:00
5   1    b 2009-02-25 02:00:00
6   1    b 2009-02-26 02:00:00
7   1    c 2009-02-26 02:00:00
8   1    c 2009-02-27 02:00:00
9   1    c 2009-02-28 02:00:00
10  1    c 2009-03-01 02:00:00
11  1    b 2009-02-26 02:00:00
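The Map approach builds one small data frame per input row and then binds them. As a hedged alternative (a different technique from the one above, not the answerer's method), the same expansion can be sketched without a per-row function by repeating row indices with rep and offsetting the start dates with sequence. It assumes from and to are of class "Date" and that to is never earlier than from; the sample data is hypothetical.

```r
# Hypothetical sample data with Date-class endpoints (an assumption).
data2 <- data.frame(
  id   = c(1, 1),
  type = c("a", "b"),
  from = as.Date(c("2009-02-21", "2009-02-25")),
  to   = as.Date(c("2009-02-23", "2009-02-26")),
  stringsAsFactors = FALSE
)

# Number of days covered by each range, endpoints included.
n <- as.integer(data2$to - data2$from) + 1L

# Repeat each row index once per day in its range.
idx <- rep(seq_len(nrow(data2)), n)

# sequence(n) counts 1..n[1], 1..n[2], ...; subtract 1 to get day offsets.
res <- data.frame(
  id   = data2$id[idx],
  type = data2$type[idx],
  date = data2$from[idx] + (sequence(n) - 1L)
)
res
```

This avoids calling rbind on many small data frames, which may matter when the input has many rows.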