Linear Interpolation using dplyr

谎友^ 2020-12-17 20:54

I'm trying to use the na.approx() function from the zoo library (in conjunction with xts) to interpolate missing values from repeated

2 Answers
  • 2020-12-17 21:40

    The solution I've gone with is based on the first comment from @docendodiscimus.

    Rather than attempting to create a new data frame, as I'd been doing, this approach simply adds columns to the existing data frame by taking advantage of dplyr's mutate() function.

    My code is now...

    library(dplyr)
    library(zoo)   # provides na.approx()

    df %>%
      group_by(variable) %>%
      arrange(variable, event.date) %>%
      # interpolate within each group, in date order
      mutate(ip.value = na.approx(value, maxgap = 4, rule = 2))
    

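    The maxgap argument means that runs of up to four consecutive NAs are interpolated, while longer runs are left as NA. The rule = 2 option extends the series into the flanking time points by filling leading and trailing NAs with the nearest observed value.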
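
    To see the two arguments in isolation, here is a minimal sketch on a made-up vector (the values are illustrative, not taken from the question). The na.rm = FALSE argument is an addition for this sketch: according to the zoo documentation it stops na.approx() from trimming leading or trailing NAs that survive interpolation, so the result keeps the input length.

```r
library(zoo)

# Hypothetical series: one leading NA, a 2-long gap, a 5-long gap, one trailing NA
x <- c(NA, 1, NA, NA, 4, NA, NA, NA, NA, NA, 10, NA)

filled <- na.approx(x, maxgap = 4, rule = 2, na.rm = FALSE)
print(filled)
# The 2-long gap is interpolated (2, 3), the 5-long gap exceeds maxgap and
# stays NA, and rule = 2 copies the nearest observed value into the leading
# and trailing NA.
```

    Inside mutate() the new column has to be as long as the group, which is why keeping the length matters there.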

  • 2020-12-17 21:48

    Use the approx() function for linear interpolation:

    df %>%
      group_by(variable) %>%
      arrange(variable, event.date) %>%
      mutate(time = seq(1, n())) %>%   # integer position within each group
      mutate(ip.value = approx(time, value, time)$y) %>%
      select(-time)
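
    As a rough illustration of what approx() does with missing values, the sketch below runs it on a made-up vector outside of dplyr. The vector and the variable names are hypothetical. With the default rule = 1, points outside the observed range stay NA; rule = 2 is shown for comparison.

```r
# Hypothetical values with NAs at both ends and in the middle
value <- c(NA, 2, NA, NA, 8, NA)
time  <- seq_along(value)

# approx() drops the NA pairs, then interpolates at every requested time
inner_only <- approx(time, value, xout = time)$y
print(inner_only)   # NA  2  4  6  8 NA

# rule = 2 additionally fills the ends with the nearest observed value
with_ends <- approx(time, value, xout = time, rule = 2)$y
print(with_ends)    # 2 2 4 6 8 8
```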
    

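    Or use the spline() function for non-linear interpolation:

    df %>%
      group_by(variable) %>%
      arrange(variable, event.date) %>%
      mutate(time = seq(1, n())) %>%   # integer position within each group
      # xout = time returns one value per row; with n = n() the output grid
      # only lines up with the rows when the first and last values are not NA
      mutate(ip.value = spline(time, value, xout = time)$y) %>%
      select(-time)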
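
    A small sketch of the spline variant on a made-up vector, outside of dplyr. The data are hypothetical, and passing xout = time is a choice made for this sketch so that one value comes back per input position. The check at the end only confirms that the fitted curve passes through the observed points.

```r
# Hypothetical values with two interior gaps
value <- c(1, NA, 9, 16, NA, 36)
time  <- seq_along(value)

# Fit a cubic spline through the observed points and evaluate it at every time
fit <- spline(time, value, xout = time)$y

observed <- !is.na(value)
print(fit)
# An interpolating spline reproduces the observed values
print(all.equal(fit[observed], value[observed]))
```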
    