replace missing values in categorical data

孤独总比滥情好 2021-01-27 00:00

Let's suppose I have a column with categorical data "red", "green", "blue" and empty cells:

red
green
red
blue
NaN

I'm sure that the NaN b

3 answers
  •  暖寄归人
    2021-01-27 00:28

    In addition to the approach in Lan's answer, which seems to be the most commonly used, you can use something based on matrix factorization. For example, there is a variant of Generalized Low Rank Models (GLRMs) that can impute such data, just as probabilistic matrix factorization is used to impute continuous data.

    GLRMs are available in H2O, which provides bindings for both Python and R.
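
To make the idea concrete, here is a minimal sketch of low-rank imputation for categorical data, written with NumPy and pandas only. It is not H2O's GLRM implementation: it is a simplified stand-in that one-hot encodes each column, fits a factorization on the observed cells with a quadratic loss and a ridge penalty, and fills each missing cell with the best-scoring category. The function name `impute_categorical_low_rank`, its parameters, and the extra `fruit` column are all hypothetical and exist only for illustration. A second column is assumed because a low-rank model needs other columns to borrow information from.

```python
import numpy as np
import pandas as pd


def impute_categorical_low_rank(df, rank=2, reg=0.1, n_iter=50, seed=0):
    """Fill missing categorical cells via a masked low-rank factorization.

    Hypothetical helper: one-hot encode, fit X ~ U @ V on observed cells
    only (quadratic loss, ridge penalty), then pick the top-scoring category.
    """
    rng = np.random.default_rng(seed)
    blocks, masks, levels = [], [], {}
    for col in df.columns:
        cats = sorted(df[col].dropna().unique())
        levels[col] = cats
        # One indicator column per category; missing cells become all zeros.
        onehot = np.stack(
            [(df[col] == c).to_numpy(dtype=float) for c in cats], axis=1
        )
        observed = df[col].notna().to_numpy()
        blocks.append(onehot)
        masks.append(np.repeat(observed[:, None], len(cats), axis=1))
    X = np.hstack(blocks)   # (n_rows, total number of categories)
    M = np.hstack(masks)    # True where the cell was observed
    n, d = X.shape
    U = rng.normal(scale=0.1, size=(n, rank))
    V = rng.normal(scale=0.1, size=(rank, d))
    ridge = reg * np.eye(rank)

    # Alternating ridge regressions restricted to observed entries.
    for _ in range(n_iter):
        for i in range(n):
            obs = M[i]
            Vo = V[:, obs]
            U[i] = np.linalg.solve(Vo @ Vo.T + ridge, Vo @ X[i, obs])
        for j in range(d):
            obs = M[:, j]
            Uo = U[obs]
            V[:, j] = np.linalg.solve(Uo.T @ Uo + ridge, Uo.T @ X[obs, j])

    scores = U @ V
    out = df.copy()
    start = 0
    for col in df.columns:
        cats = levels[col]
        block = scores[:, start:start + len(cats)]
        start += len(cats)
        missing = df[col].isna().to_numpy()
        if missing.any():
            best = np.array(cats, dtype=object)[block.argmax(axis=1)]
            out.loc[missing, col] = best[missing]
    return out


# Toy data: the "color" column from the question plus an assumed "fruit" column.
df = pd.DataFrame({
    "color": ["red", "green", "red", "blue", np.nan, "green", "blue", "red"],
    "fruit": ["apple", "lime", "apple", "berry", "apple", "lime", "berry", "apple"],
})
filled = impute_categorical_low_rank(df)
print(filled)
```

Observed cells are left untouched; only the missing `color` cell is filled, using whichever category scores highest in the reconstruction. A full GLRM would typically use a loss designed for categorical data rather than the quadratic loss assumed here.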
