Question
I am using the optbinning module to bin all the variables for a logistic regression model. However, OptimalBinning works on only one variable at a time, for example:
variable = "REGION_POPULATION_RELATIVE"
x = df[variable].values
y = df.TARGET.values
from optbinning import OptimalBinning
optb = OptimalBinning(name=variable, dtype="numerical", solver="ls",
                      max_n_prebins=100, min_prebin_size=0.001, time_limit=50)
optb.fit(x, y)
How can I use a loop to get the binning results for all variables? I tried the following code:
variable_names = train_validation_valid_nonstring_nondatetype_categoryencoders.keys()
for i in variable_names:
    optb = OptimalBinning(name=i, dtype="numerical", solver="cp")
    optb.fit(x_category_encoders_target, y_category_encoders)
However, this raises the error "operands could not be broadcast together with shapes (52803,602) (52803,) ". My DataFrame contains hundreds of variables, so binning them one by one would be a huge amount of work. Please help, thanks.
Answer 1:
To compute the optimal binning of all variables in a dataset, you can use the BinningProcess class.
Tutorial: http://gnpalencia.org/optbinning/tutorials/tutorial_binning_process_telco_churn.html
Documentation: http://gnpalencia.org/optbinning/binning_process.html
from optbinning import BinningProcess
binning_process = BinningProcess(variable_names=variable_names)
binning_process.fit(df[variable_names], df[target])
Then you can retrieve the information for each variable, or for a given list of variables, using the get_binned_variable method. For example:
for variable in variable_names:
    optb = binning_process.get_binned_variable(name=variable)
    optb.binning_table.build()
    optb.binning_table.plot()
    optb.binning_table.analysis()
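As a side note on the error in the question: the shapes (52803,602) and (52803,) suggest that the loop passed the whole 602-column matrix to fit on every iteration, while OptimalBinning expects a single 1-D variable. The following is a minimal sketch in plain NumPy that reproduces the same kind of broadcasting failure and shows that selecting one column at a time gives an array whose shape matches the target. The array sizes and the names X, y and var_0, var_1, ... are made up for illustration and are not taken from the original data.

```python
import numpy as np

# Hypothetical stand-ins for the question's data: 8 rows, 3 variables.
rng = np.random.default_rng(0)
X = rng.normal(size=(8, 3))        # 2-D feature matrix, shape (8, 3)
y = rng.integers(0, 2, size=8)     # 1-D binary target, shape (8,)

# Combining the full 2-D matrix with the 1-D target elementwise fails,
# because (8, 3) and (8,) cannot be broadcast together.
try:
    X * y
    error_message = None
except ValueError as exc:
    error_message = str(exc)

# Selecting one column per variable gives 1-D arrays aligned with y.
columns = {f"var_{j}": X[:, j] for j in range(X.shape[1])}
shapes = {name: col.shape for name, col in columns.items()}

print(error_message)
print(shapes)
```

In the loop from the question, this would mean fitting each OptimalBinning on the column belonging to i rather than on x_category_encoders_target as a whole. BinningProcess, as shown above, takes care of that per-variable iteration for you.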
Source: https://stackoverflow.com/questions/62035217/how-can-i-call-optbinning-module-get-results-of-all-varible-binning