Scikit-learn confusion matrix

情书的邮戳 2021-02-01 04:15

I can't figure out if I've set up my binary classification problem correctly. I labeled the positive class 1 and the negative class 0. However, it is my understanding that by default scikit-learn puts the negative class (0) first in the confusion matrix — is that right?

4 Answers
  •  醉话见心
    2021-02-01 04:39

    Supporting Answer:

    When reading the confusion matrix produced by sklearn.metrics, be aware that the order of the values is

        [[True Negative, False Positive],
         [False Negative, True Positive]]

    If you interpret the values wrong, say TP as TN, your accuracy and ROC AUC will more or less match, but your precision, recall (sensitivity), and F1-score will take a hit and you will end up with completely different metrics. This will lead you to misjudge your model's performance.

    Do make sure to clearly identify what 1 and 0 represent in your model. This heavily influences how the confusion matrix should be read.

    Experience:

    I was working on predicting fraud (binary supervised classification), where fraud was denoted by 1 and non-fraud by 0. My model was trained on an upsampled, perfectly balanced dataset, so during in-time testing the confusion-matrix values did not look suspicious even though I was reading them in the order [[TP, FP], [FN, TN]].

    Later, when I had to run an out-of-time test on a new, imbalanced test set, I realized that the order above was wrong and differed from the one on sklearn's documentation page, which gives the order as tn, fp, fn, tp. Plugging in the correct order revealed the blunder and how much it had skewed my judgement of the model's performance.
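The ordering the answer describes is easy to verify on a toy example (a minimal sketch; the labels below are made up purely for illustration):

```python
from sklearn.metrics import confusion_matrix

# 0 = negative class, 1 = positive class
y_true = [0, 0, 0, 0, 1, 1, 1]
y_pred = [0, 0, 1, 1, 0, 1, 1]

# Rows are true labels, columns are predicted labels,
# with label 0 first: [[TN, FP], [FN, TP]]
cm = confusion_matrix(y_true, y_pred)
tn, fp, fn, tp = cm.ravel()
print(cm)              # [[2 2]
                       #  [1 2]]
print(tn, fp, fn, tp)  # 2 2 1 2
```

Note that `cm.ravel()` flattens the matrix row by row, which is why the documented unpacking order is `tn, fp, fn, tp` and not `tp, fp, fn, tn`.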
