max_value and min_value for each column in scikit IterativeImputer

Question

I have a data set with 78 columns and 5707 rows. Almost every column has missing values, and I would like to impute them with IterativeImputer. If I understand it correctly, it makes a "smarter" imputation for each column based on the information in the other columns.

However, I do not want the imputed values to be less than the observed minimum or greater than the observed maximum. I know there are max_value and min_value parameters, but I do not want to impose a single "global" limit on the imputations. Instead, I want each column to have its own max_value and min_value (namely the minimum and maximum already observed in that column). Otherwise the imputed values do not make sense (negative values for headcounts, negative values for rates, etc.).

Is there a way to implement that?

Answer 1:

If you want a different max and min for each column, you can loop over the columns: in each iteration, select the column using sklearn.compose.make_column_selector or sklearn.compose.make_column_transformer, then apply IterativeImputer to that column, passing that column's observed max and min as the max_value and min_value parameters.
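As a possible alternative to the loop, here is a minimal sketch that relies on min_value and max_value accepting one bound per feature (array-like of shape (n_features,)). To my knowledge this is supported from scikit-learn 0.23 onward, so check your installed version. Passing the bounds this way keeps all columns in a single imputer, so each column can still be predicted from the others. The helper name impute_within_observed_range and the toy data are made up for illustration. The sketch also assumes no column is entirely missing, since such columns are dropped by default and the output shape would no longer match.

```python
import numpy as np
import pandas as pd
# IterativeImputer is experimental, so this import is required to enable it.
from sklearn.experimental import enable_iterative_imputer  # noqa: F401
from sklearn.impute import IterativeImputer


def impute_within_observed_range(df, **imputer_kwargs):
    """Hypothetical helper: impute df, bounding each column by its observed min/max."""
    # DataFrame.min()/max() skip NaN by default, so these are the observed bounds.
    col_min = df.min().to_numpy(dtype=float)
    col_max = df.max().to_numpy(dtype=float)
    # One lower and one upper bound per feature (assumes scikit-learn >= 0.23).
    imputer = IterativeImputer(min_value=col_min, max_value=col_max, **imputer_kwargs)
    imputed = imputer.fit_transform(df)
    return pd.DataFrame(imputed, columns=df.columns, index=df.index)


# Toy data, invented for illustration only.
rng = np.random.default_rng(0)
full = pd.DataFrame({
    "headcount": rng.integers(0, 50, size=200).astype(float),
    "rate": rng.uniform(0.0, 1.0, size=200),
})
full["score"] = 2.0 * full["headcount"] + rng.normal(0, 5, size=200)

# Knock out roughly 20% of the cells at random.
data = full.mask(rng.uniform(size=full.shape) < 0.2)

result = impute_within_observed_range(data, random_state=0, max_iter=10)
print(result.describe().loc[["min", "max"]])
```

On older scikit-learn versions, where only scalar bounds are accepted, a rough fallback would be to impute without bounds and then clip each column to its observed range afterwards, although that is not the same as constraining the imputer during fitting.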

Source: https://stackoverflow.com/questions/60228714/max-value-and-min-value-for-each-column-in-scikit-iterativeimputer

Tags

python

pandas

scikit-learn

sklearn-pandas

imputation
