Replace missing values (NA) with most recent non-NA by group

南旧 2020-11-22 05:42

I would like to solve the following problem with dplyr, preferably with one of the window functions. I have a data frame with houses and buying prices. The following is an e

7 answers
  • 2020-11-22 06:17

    A dplyr and imputeTS combination.

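    library(dplyr)
    library(imputeTS)
    # Within each house, carry the last observed price forward; leading NAs stay NA.
    # imputeTS >= 3.0 spelling; older releases used na.locf(price, na.remaining = "keep").
    df %>%
      group_by(houseID) %>%
      mutate(price = na_locf(price, na_remaining = "keep"))

    You could also replace na_locf with one of the more advanced missing-data replacement (imputation) functions from imputeTS, for example na_interpolation or na_kalman. To do so, just swap na_locf for the name of the function you prefer. (Older imputeTS releases spell these with dots: na.locf, na.interpolation, na.kalman.)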
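
For readers who want to see what "last observation carried forward by group" does without installing anything, here is a minimal base R sketch. It is not the imputeTS implementation: the helper name locf_keep and the sample data are made up for illustration, and only the column names houseID and price come from the answer above.

```r
# Hypothetical sample data; only the column names come from the answer.
df <- data.frame(
  houseID = c(1, 1, 1, 2, 2, 2),
  price   = c(NA, 100, NA, 200, NA, NA)
)

# Hypothetical helper: carry the last non-NA value forward.
# NAs that come before the first observed value are kept as NA.
locf_keep <- function(x) {
  idx <- cumsum(!is.na(x))      # how many non-NA values seen so far (0 = none yet)
  c(NA, x[!is.na(x)])[idx + 1]  # slot 1 holds NA for the "none yet" case
}

# ave() applies the helper within each houseID group and keeps the row order.
df$price_filled <- ave(df$price, df$houseID, FUN = locf_keep)
print(df)
```

The leading NA for house 1 stays missing because there is no earlier price to carry forward, which matches the "keep" behaviour requested in the answer.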
