问题
I have irregular time series data frame with time
(seconds) and value
columns. I want to add another column, value_2
where values are lead by delay
seconds. So value_2
at time t
equals to value
at time t + delay
or right after that.
ts=data.frame(
time=c(1,2,3,5,8,10,11,15,20,23),
value=c(1,2,3,4,5,6,7,8,9,10)
)
ts_with_delayed_value <- add_delayed_value(ts, "value", 2, "time")
> ts_with_delayed_value
time value value_2
1 1 1 3
2 2 2 4
3 3 3 4
4 5 4 5
5 8 5 6
6 10 6 8
7 11 7 8
8 15 8 9
9 20 9 10
10 23 10 10
I have my own version of this function add_delayed_value
, here it is:
add_delayed_value <- function(data, colname, delay, colname_time) {
colname_delayed <- paste(colname, sprintf("%d", delay), sep="_")
data[colname_delayed] <- NaN
for (i in 1:nrow(data)) {
time_delayed <- data[i, colname_time] + delay
value_delayed <- data[data[colname_time] >= time_delayed, colname][1]
if (is.na(value_delayed)) {
value_delayed <- data[i, colname]
}
data[i, colname_delayed] <- value_delayed
}
return(data)
}
Is there a way to vectorize this routine to avoid the slow loop?
I'm quite new to R, so this code probably has lots of issues. What can be improved about it?
回答1:
You could try:
library(dplyr)
library(zoo)
na.locf(ts$value[sapply(ts$time, function(x) min(which(ts$time - x >=2 )))])
[1] 3 4 4 5 6 8 8 9 10 10
回答2:
This should work for your data. If you want to make a general function, you'll have to play around with lazyeval, which honestly might not be worth it.
library(dplyr)
library(zoo)
carry_back = . %>% na.locf(na.rm = TRUE, fromLast = FALSE)
data_frame(time =
with(ts,
seq(first(time),
last(time) ) ) ) %>%
left_join(ts) %>%
transmute(value_2 = carry_back(value),
time = time - delay) %>%
right_join(ts) %>%
mutate(value_2 =
value_2 %>%
is.na %>%
ifelse(last(value), value_2) )
回答3:
What you want is not clear, give a pseudo code or a formula. It looks like this is what you want... From what I understand from you the last value should be NA
library(data.table)
setDT(ts,key='time')
ts_delayed = ts[,.(time_delayed=time+2)]
setkey(ts_delayed,time_delayed)
ts[ts_delayed,roll=-Inf]
来源:https://stackoverflow.com/questions/36753810/r-lag-lead-irregular-time-series-data