How to extract False Positive, False Negative from a confusion matrix of multiclass classification

再見小時候 2021-01-12 19:48

I am classifying MNIST data using the following Keras code. From the confusion_matrix command of sklearn.metrics I got a confusion matrix and from True

1 answer
  • 2021-01-12 20:26

    First of all, your code has some omissions; to get it to run, I had to add the following lines:

    import keras
    (x_train, y_train), (x_test, y_test) = mnist.load_data()
    

    Having done that, and given the confusion matrix cm1:

    array([[ 965,    0,    1,    0,    0,    2,    6,    1,    5,    0],
           [   0, 1113,    4,    2,    0,    0,    3,    0,   13,    0],
           [   8,    0,  963,   14,    5,    1,    7,    8,   21,    5],
           [   0,    0,    3,  978,    0,    7,    0,    6,   12,    4],
           [   1,    0,    4,    0,  922,    0,    9,    3,    3,   40],
           [   4,    1,    1,   27,    0,  824,    6,    1,   20,    8],
           [  11,    3,    1,    1,    5,    6,  925,    0,    6,    0],
           [   2,    6,   17,    8,    2,    0,    1,  961,    2,   29],
           [   5,    1,    2,   13,    4,    6,    2,    6,  929,    6],
           [   6,    5,    0,    7,    5,    6,    1,    6,   10,  963]])
    

    here is how you can get the requested TP, FP, FN, TN per class:

    The True Positives are simply the diagonal elements:

    TruePositive = np.diag(cm1)
    TruePositive
    # array([ 965, 1113,  963,  978,  922,  824,  925,  961,  929,  963])
    

    The False Positives are the sum of the respective column, minus the diagonal element:

    FalsePositive = []
    for i in range(num_classes):
        FalsePositive.append(sum(cm1[:,i]) - cm1[i,i])
    FalsePositive
    # [37, 16, 33, 72, 21, 28, 35, 31, 92, 92]
    

    Similarly, the False Negatives are the sum of the respective row, minus the diagonal element:

    FalseNegative = []
    for i in range(num_classes):
        FalseNegative.append(sum(cm1[i,:]) - cm1[i,i])
    FalseNegative
    # [15, 22, 69, 32, 60, 68, 33, 67, 45, 46]
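
The two loops above can also be written without explicit iteration. The following is a minimal sketch, assuming the scikit-learn convention that rows of the confusion matrix are true classes and columns are predicted classes; the lower-case variable names are mine, not part of the original code.

```python
import numpy as np

# Confusion matrix from the answer above
# (rows: true class, columns: predicted class).
cm1 = np.array([
    [965,    0,   1,   0,   0,   2,   6,   1,   5,   0],
    [  0, 1113,   4,   2,   0,   0,   3,   0,  13,   0],
    [  8,    0, 963,  14,   5,   1,   7,   8,  21,   5],
    [  0,    0,   3, 978,   0,   7,   0,   6,  12,   4],
    [  1,    0,   4,   0, 922,   0,   9,   3,   3,  40],
    [  4,    1,   1,  27,   0, 824,   6,   1,  20,   8],
    [ 11,    3,   1,   1,   5,   6, 925,   0,   6,   0],
    [  2,    6,  17,   8,   2,   0,   1, 961,   2,  29],
    [  5,    1,   2,  13,   4,   6,   2,   6, 929,   6],
    [  6,    5,   0,   7,   5,   6,   1,   6,  10, 963],
])

true_positive = np.diag(cm1)

# Column i sums every sample predicted as class i;
# subtracting the diagonal leaves the false positives.
false_positive = cm1.sum(axis=0) - true_positive

# Row i sums every sample whose true class is i;
# subtracting the diagonal leaves the false negatives.
false_negative = cm1.sum(axis=1) - true_positive

print(false_positive)
print(false_negative)
```

Both results are NumPy arrays rather than Python lists, so they can be combined element-wise in later steps.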
    

    The True Negatives are a little trickier. Let's first think about what exactly a True Negative means with respect to, say, class 0: it is every sample that has been correctly identified as not being 0, i.e. a sample whose true class is not 0 and whose predicted class is not 0 either. So, essentially, we should remove the corresponding row and column from the confusion matrix and then sum up all the remaining elements:

    TrueNegative = []
    for i in range(num_classes):
        temp = np.delete(cm1, i, 0)   # delete ith row
        temp = np.delete(temp, i, 1)  # delete ith column
        TrueNegative.append(sum(sum(temp)))
    TrueNegative
    # [8983, 8849, 8935, 8918, 8997, 9080, 9007, 8941, 8934, 8899]
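
Since every sample falls into exactly one of the four groups for a given class, the True Negatives can equivalently be obtained as the grand total minus TP, FP, and FN. The sketch below checks that this shortcut agrees with the delete-row-and-column approach; the 3x3 matrix `cm` is a made-up example for illustration, not data from the question.

```python
import numpy as np

# Hypothetical 3-class confusion matrix (rows: true, columns: predicted).
cm = np.array([
    [5, 1, 0],
    [2, 6, 1],
    [0, 3, 7],
])

tp = np.diag(cm)
fp = cm.sum(axis=0) - tp
fn = cm.sum(axis=1) - tp

# Shortcut: everything that is not TP, FP, or FN for a class is a TN.
tn = cm.sum() - (tp + fp + fn)

# Reference: delete row i and column i, then sum what remains.
tn_by_deletion = np.array([
    np.delete(np.delete(cm, i, axis=0), i, axis=1).sum()
    for i in range(cm.shape[0])
])

print(tn)
print(np.array_equal(tn, tn_by_deletion))
```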
    

    Let's do a sanity check: for each class, the sum of TP, FP, FN, and TN must equal the size of our test set (here 10,000). Let's confirm that this is indeed the case:

    l = len(y_test)
    for i in range(num_classes):
        print(TruePositive[i] + FalsePositive[i] + FalseNegative[i] + TrueNegative[i] == l)
    

    The result is

    True
    True
    True
    True
    True
    True
    True
    True
    True
    True
    