Fill missing value based on probability of occurrence

自闭症患者 2021-01-22 02:01

This is what my data.table/dataframe looks like:

library(data.table)
dt <- fread('
   STATE     ZIP      
   PA        19333        
   PA        19327                 


        
1 Answer
  •  南方客 (original poster)
    2021-01-22 02:43

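    Here's a way using just sample, wrapped up in a convenience function. Within each STATE it draws replacements from the non-missing ZIP values, with replacement, so each value is picked in proportion to how often it occurs.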

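    sample_fill_na = function(x) {
        x_na = is.na(x)
        pool = x[!x_na]
        # Nothing to fill, or nothing to sample from: return x unchanged
        if (!any(x_na) || length(pool) == 0L) return(x)
        # Sample by index: sample(pool, ...) would treat a single number n as 1:n
        x[x_na] = pool[sample.int(length(pool), size = sum(x_na), replace = TRUE)]
        return(x)
    }

    # Fill each state's missing ZIPs from that state's observed ZIPs
    dt[, ZIP := sample_fill_na(ZIP), by = STATE]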
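
As a rough check of what "based on probability of occurrence" means here, the sketch below runs the same idea on a small made-up table. The ZIP values for the second state, the table name toy, and the helper name fill_na_by_freq are illustrative assumptions, not taken from the question. Because replacements are drawn from the observed non-missing values with duplicates kept, a ZIP seen three times as often should be about three times as likely to be chosen. The helper samples by position, since sample() on a single number n draws from 1:n rather than returning n.

```r
library(data.table)

# Hypothetical helper: same idea as sample_fill_na, but sampling by position
fill_na_by_freq <- function(x) {
    x_na <- is.na(x)
    pool <- x[!x_na]                      # observed values, duplicates kept
    if (!any(x_na) || length(pool) == 0L) return(x)
    idx <- sample.int(length(pool), size = sum(x_na), replace = TRUE)
    x[x_na] <- pool[idx]
    x
}

# Made-up data: in PA, 19333 is observed three times and 19327 once;
# DE has a single observed ZIP, which exercises the one-value case
toy <- data.table(
    STATE = c(rep("PA", 6), rep("DE", 3)),
    ZIP   = c(19333L, 19333L, 19333L, 19327L, NA, NA, 19701L, NA, NA)
)

set.seed(1)  # only so the illustration is reproducible
toy[, ZIP := fill_na_by_freq(ZIP), by = STATE]
print(toy)
```

On a real table, set.seed() can be dropped unless reproducible fills are wanted. A group with no observed ZIP at all is left as NA, since there is nothing to sample from.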
    
