Question
I'm trying to understand the difference between RidgeClassifier and LogisticRegression in sklearn.linear_model. I couldn't find it in the documentation.
I think I understand quite well what LogisticRegression does. It computes the coefficients and intercept that minimise half of the sum of squares of the coefficients plus C times the binary cross-entropy loss, where C is the regularisation parameter. I checked against a naive implementation from scratch, and the results coincide.
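That sanity check can be reproduced in a few lines. The sketch below (dataset and C value chosen arbitrarily for illustration) minimises 0.5 * ||w||^2 + C * (binary cross-entropy) by hand with scipy.optimize.minimize and compares the solution to sklearn's LogisticRegression; the intercept is left unpenalised, matching sklearn's convention:

```python
import numpy as np
from scipy.optimize import minimize
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

X, y = make_classification(n_samples=100, n_features=4, random_state=0)
C = 1.0

clf = LogisticRegression(C=C, penalty="l2").fit(X, y)

def objective(params):
    w, b = params[:-1], params[-1]
    z = X @ w + b
    # binary cross-entropy written as log(1 + exp(-y*z)) with y in {-1, +1};
    # only the coefficients w are penalised, not the intercept b
    y_pm = np.where(y == 1, 1.0, -1.0)
    return 0.5 * w @ w + C * np.sum(np.log1p(np.exp(-y_pm * z)))

res = minimize(objective, np.zeros(X.shape[1] + 1), method="L-BFGS-B")

print(np.allclose(res.x[:-1], clf.coef_.ravel(), atol=1e-2))
print(abs(res.x[-1] - clf.intercept_[0]) < 1e-2)
```

Both optimisers converge to the same convex objective, so the coefficients agree up to solver tolerance.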
The results of RidgeClassifier differ, and I couldn't figure out how the coefficients and intercept are computed there. I looked at the GitHub code, but I'm not experienced enough to untangle it.
The reason I'm asking is that I like the RidgeClassifier results -- it generalises a bit better on my problem. But before I use it, I would like to at least have an idea of where it comes from.
Thanks for any help.
Answer 1:
RidgeClassifier() works differently from LogisticRegression() with l2 penalty: the loss function for RidgeClassifier() is not cross-entropy. RidgeClassifier() uses the Ridge() regression model in the following way to create a classifier.
Let us consider binary classification for simplicity:

1. Convert the target variable into +1 or -1 based on the class it belongs to.
2. Build a Ridge() model (which is a regression model) to predict this encoded target. The loss function is MSE + l2 penalty.
3. If the Ridge() regression's prediction value (computed by the decision_function() method) is greater than 0, predict the positive class; otherwise, predict the negative class.
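The binary case above can be verified with a short sketch (dataset and alpha chosen arbitrarily): encode the two classes as -1/+1, fit a plain Ridge() regression on the encoded target, and compare against RidgeClassifier():

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import Ridge, RidgeClassifier

X, y = make_classification(n_samples=200, n_features=5, random_state=0)

clf = RidgeClassifier(alpha=1.0).fit(X, y)

# The same thing by hand: encode classes as -1/+1 and fit a plain Ridge regression
y_pm = np.where(y == 1, 1.0, -1.0)
reg = Ridge(alpha=1.0).fit(X, y_pm)

print(np.allclose(clf.coef_.ravel(), reg.coef_))   # identical coefficients
print(np.allclose(clf.intercept_, reg.intercept_)) # identical intercept
manual_pred = (reg.predict(X) > 0).astype(int)     # threshold the regression output at 0
print((manual_pred == clf.predict(X)).all())       # identical predictions
```

Both models solve the same MSE + l2 problem, so the coefficients, intercept, and thresholded predictions all coincide.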
For multi-class classification:

1. Use LabelBinarizer() to create a multi-output regression scenario, then train independent Ridge() regression models, one per class (One-vs-Rest modelling).
2. Get the prediction from each class's Ridge() regression model (a real number for each class), and then use argmax to predict the class.
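The multi-class steps can be sketched the same way (dataset and alpha chosen arbitrarily): binarize the labels to a -1/+1 matrix, fit one Ridge() regression per column in a single multi-output call, and take the argmax over the real-valued outputs:

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import Ridge, RidgeClassifier
from sklearn.preprocessing import LabelBinarizer

X, y = make_classification(n_samples=300, n_features=8, n_informative=5,
                           n_classes=3, random_state=0)

clf = RidgeClassifier(alpha=1.0).fit(X, y)

# By hand: binarize labels to an (n_samples, n_classes) matrix of -1/+1,
# fit a multi-output Ridge (one independent model per column),
# then predict via argmax over the real-valued outputs
lb = LabelBinarizer(neg_label=-1, pos_label=1)
Y = lb.fit_transform(y)
reg = Ridge(alpha=1.0).fit(X, Y)
manual_pred = lb.classes_[np.argmax(reg.predict(X), axis=1)]

print(np.allclose(clf.coef_, reg.coef_))       # identical per-class coefficients
print((manual_pred == clf.predict(X)).all())   # identical predictions
```

Each column of Y gives an independent ridge problem, which is exactly the One-vs-Rest construction described above.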
Source: https://stackoverflow.com/questions/53911663/what-does-sklearn-ridgeclassifier-do