I have a data frame that looks like this:
id date
1001 2012-10-11
1005 2013-02-20
1005 2012-11-21
1005 2014-03-14
1003 2013-10-25
1003 2013-11-30
Using the data.table package, you could try the following (though it doesn't preserve the original row order), assuming df is your data set:
library(data.table)
setkey(setDT(df)[, date := as.Date(date)], id, date) # If `date` is already of `Date` class you can skip the `as.Date` part
df[, no_of_days := c(NA, diff(date)) , by = id][]
#      id       date no_of_days
# 1: 1001 2012-10-11         NA
# 2: 1003 2013-10-25         NA
# 3: 1003 2013-11-30         36
# 4: 1005 2012-11-21         NA
# 5: 1005 2013-02-20         91
# 6: 1005 2014-03-14        387
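To see what c(NA, diff(date)) does within a single group, here is a small base R sketch using the three sorted dates for id 1005 from the output above. The names d and gaps are made up for illustration only.

```r
# Sorted dates for id 1005, taken from the example output
d <- as.Date(c("2012-11-21", "2013-02-20", "2014-03-14"))

# diff() on Date values returns a difftime with one element fewer than d
gaps <- diff(d)
class(gaps)   # "difftime"
units(gaps)   # "days"

# Prepending NA pads the result back to length(d); because NA comes first,
# c() drops the difftime class and returns a plain numeric vector
no_of_days <- c(NA, gaps)
no_of_days    # NA 91 387
```

The leading NA marks the first date of each id, which has no earlier date to compare against.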
Or (as @Arun suggested) you can preserve the original order by using order instead of setkey:
setDT(df)[, date := as.Date(date)][order(id, date),
          no_of_days := c(NA, diff(date)), by = id][]
You could also try dplyr:
library(dplyr)
df %>%
  mutate(date = as.Date(date)) %>%
  arrange(id, date) %>%
  group_by(id) %>%
  mutate(no_of_days = c(NA, diff(date)))
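The dplyr pipeline above sorts the rows with arrange. If the original row order matters, one possible variant (a sketch, not part of the original answer) skips arrange and uses the order_by argument of dplyr's lag, so that each date is compared with the previous date of the same id. The name res is made up for illustration, and the data frame is rebuilt inline so the snippet runs on its own.

```r
library(dplyr)

# Same six rows as the example data, in their original order
df <- data.frame(
  id   = c(1001L, 1005L, 1005L, 1005L, 1003L, 1003L),
  date = as.Date(c("2012-10-11", "2013-02-20", "2012-11-21",
                   "2014-03-14", "2013-10-25", "2013-11-30"))
)

res <- df %>%
  group_by(id) %>%
  # lag() finds the previous date in date order but returns it in row order;
  # subtracting two Date vectors gives a difftime in days
  mutate(no_of_days = as.numeric(date - lag(date, order_by = date))) %>%
  ungroup()

res$no_of_days   # NA 91 NA 387 NA 36
```

Here no_of_days lines up with the rows as they were entered, so no re-sorting step is needed afterwards.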
Or using ave from base R (similar to @David Arenburg's approach), which also preserves the original row order:
indx <- with(df, order(id, date))
df1 <- transform(df[indx,], no_of_days=ave(as.numeric(date), id,
FUN= function(x) c(NA, diff(x))))[order(indx),]
df1
#    id       date no_of_days
#1 1001 2012-10-11         NA
#2 1005 2013-02-20         91
#3 1005 2012-11-21         NA
#4 1005 2014-03-14        387
#5 1003 2013-10-25         NA
#6 1003 2013-11-30         36
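The [order(indx),] step in the ave code is what puts the rows back in their original positions: calling order on a permutation returns its inverse. A tiny base R illustration, with made-up values and the hypothetical names x, sorted and restored:

```r
x <- c(30, 10, 20)

indx <- order(x)                  # 2 3 1: positions of x in ascending order
sorted <- x[indx]                 # 10 20 30

# order(indx) is the inverse permutation, so it undoes the sort
restored <- sorted[order(indx)]   # 30 10 20

identical(restored, x)            # TRUE
```

The same trick works on data frame rows, which is why the result above comes back in the order of the input data.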
The data used above:
df <- structure(list(id = c(1001L, 1005L, 1005L, 1005L, 1003L, 1003L
), date = structure(c(15624, 15756, 15665, 16143, 16003, 16039
), class = "Date")), .Names = c("id", "date"), row.names = c(NA,
-6L), class = "data.frame")