问题
Can anyone explain how to interpret coefficientMatrix
, interceptVector
, Confusion matrix
of a multinomial logistic regression
.
According to Spark documentation:
Multiclass classification is supported via multinomial logistic (softmax) regression. In multinomial logistic regression, the algorithm produces K sets of coefficients, or a matrix of dimension K×J where K is the number of outcome classes and J is the number of features. If the algorithm is fit with an intercept term then a length K vector of intercepts is available.
I turned an example using spark ml 2.3.0 and I got this result.
.
If I analyse what I get :
The coefficientMatrix
has dimension of 5 * 11
The interceptVector
has dimension of 5
If so,why the Confusion matrix
has a dimension of 4 * 4
?
Also, can anyone give an interpretation of coefficientMatrix
, interceptVector
?
Why I get negative coefficients ?
If 5 is the number of classes after classification, why I get 4 rows in the confusion matrix
?
EDIT
I forgot to mention that I am still beginner in machine learning and that my search in google didn't help, so maybe I get an Up Vote :)
回答1:
Regarding the 4x4 confusion matrix: I imagine that when you split your data into test and train, there were 5 classes present in your training set and only 4 classes present in your test set. This can easily happen if the distribution of your response variable is imbalanced.
You'll want to try to perform some stratified split between test and train prior to modeling. If you are working with pyspark, you may find this library helpful: https://github.com/databricks/spark-sklearn
Now regarding negative coefficients for a multi-class Logistic Regression: As you mentioned, your returned coefficientMatrix shape is 5x11. Spark generated five models via one-vs-all approach. The 1st model corresponds to the model where the positive class is the 1st label and the negative class is composed of all other labels. Lets say the 1st coefficient for this model is -2.23. In order to interpret this coefficient we take the exponential of -2.23 which is (approx) 0.10. Interpretation here: 'With one unit increase of 1st feature we expect a reduced odds of the positive label by 90%'
来源:https://stackoverflow.com/questions/50784833/interpreting-coefficientmatrix-interceptvector-and-confusion-matrix-on-multinom