I'm trying to use the na.approx() function from the zoo library (in conjunction with xts) to interpolate missing values from repeated
The solution I've gone with is based on the first comment from @docendodiscimus. Rather than attempting to create a new data frame, as I'd been doing, this approach simply adds columns to the existing data frame by taking advantage of dplyr's mutate() function.
My code is now...
df %>%
  group_by(variable) %>%
  arrange(variable, event.date) %>%
  mutate(ip.value = na.approx(value, maxgap = 4, rule = 2))
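The maxgap argument allows runs of up to four consecutive NAs to be interpolated (longer runs are left as NA), whilst rule = 2 allows extrapolation into the flanking time points by carrying the nearest observed value outwards.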
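To illustrate how the two arguments interact, here is a minimal sketch on a made-up vector (the values are hypothetical, not taken from the data above). It also passes na.rm = FALSE, which is an addition to the code above: by default na.approx() trims any leading or trailing NAs that remain after interpolation, which can change the length of the result and make mutate() fail.

```r
library(zoo)

# Hypothetical values: a one-element gap, a five-element gap, and a trailing NA
value <- c(1, NA, 3, NA, NA, NA, NA, NA, 9, NA)

# maxgap = 4 : only fill runs of at most four consecutive NAs
# rule = 2   : fill the ends with the nearest observed value
# na.rm = FALSE (assumption, not in the original code): keep the input length
filled <- na.approx(value, maxgap = 4, rule = 2, na.rm = FALSE)

print(filled)
# Position 2 is interpolated, positions 4 to 8 (a run of five) stay NA,
# and the trailing NA takes the last observed value.
```

Because the output always has the same length as the input, this form can be dropped into mutate() even for groups with long gaps at either end.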
Use the approx() function for linear interpolation:
df %>%
  group_by(variable) %>%
  arrange(variable, event.date) %>%
  mutate(time = seq(1, n())) %>%
  mutate(ip.value = approx(time, value, time)$y) %>%
  select(-time)
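A small worked sketch of the approx() pipeline follows, using a toy data frame with the same column names (the data itself is hypothetical). Note that approx() defaults to rule = 1, so NAs before the first or after the last observed value in a group are left as NA; pass rule = 2 if those should be filled as well. Each group needs at least two non-missing values for approx() to work.

```r
library(dplyr)

# Hypothetical toy data shaped like the df used above
df <- data.frame(
  variable   = rep(c("a", "b"), each = 5),
  event.date = rep(as.Date("2020-01-01") + 0:4, times = 2),
  value      = c(1, NA, NA, 4, NA, 10, NA, 30, NA, 50)
)

out <- df %>%
  group_by(variable) %>%
  arrange(variable, event.date) %>%
  mutate(time = seq(1, n())) %>%                      # within-group index
  mutate(ip.value = approx(time, value, time)$y) %>%  # linear fill at each index
  ungroup() %>%
  select(-time)

print(out$ip.value)
# Group "a" keeps its trailing NA (rule = 1); group "b" is filled completely.
```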
Alternatively, use the spline() function for non-linear interpolation:
df %>%
  group_by(variable) %>%
  arrange(variable, event.date) %>%
  mutate(time = seq(1, n())) %>%
  mutate(ip.value = spline(time, value, xout = time)$y) %>%
  select(-time)
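One detail to watch with spline(): when it is given n rather than xout, it returns n evenly spaced points between the smallest and largest x that have a non-missing y, so the result only lines up with the original rows if the first and last values are observed. The sketch below, on a hypothetical vector with a leading NA, compares the two calls; passing xout = time pins the output to the original time points.

```r
# Hypothetical series (t^2) with a leading NA and an interior NA
time  <- 1:6
value <- c(NA, 4, 9, NA, 25, 36)
obs   <- !is.na(value)

# n = ... : six evenly spaced points between time 2 and time 6
by_n <- spline(time, value, n = length(time))$y

# xout = time : one value per original time point (time 1 is extrapolated)
by_xout <- spline(time, value, xout = time)$y

# The xout version reproduces every observed value at its own position
print(all.equal(by_xout[obs], value[obs]))
```

Unlike approx() with rule = 1, spline() extrapolates beyond the observed range, so leading and trailing gaps are filled with values from the fitted curve; treat those with caution.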