Scikit-learn: How to obtain True Positive, True Negative, False Positive and False Negative

Asked 2020-12-02 04:25

My problem:

I have a dataset which is a large JSON file. I read it and store it in the trainList variable.

Next, I pre-process

16 answers
  • 2020-12-02 05:00

    For the multi-class case, everything you need can be found from the confusion matrix. Per class, TP is the diagonal entry, FP is the sum of that class's column minus the diagonal entry, FN is the sum of that class's row minus the diagonal entry, and TN is everything else.

    Using pandas/numpy, you can do this for all classes at once like so:

    import numpy as np

    FP = confusion_matrix.sum(axis=0) - np.diag(confusion_matrix)
    FN = confusion_matrix.sum(axis=1) - np.diag(confusion_matrix)
    TP = np.diag(confusion_matrix)
    # For a NumPy array use .sum(); for a pandas DataFrame use .values.sum()
    TN = confusion_matrix.sum() - (FP + FN + TP)
    
    # Sensitivity, hit rate, recall, or true positive rate
    TPR = TP/(TP+FN)
    # Specificity or true negative rate
    TNR = TN/(TN+FP) 
    # Precision or positive predictive value
    PPV = TP/(TP+FP)
    # Negative predictive value
    NPV = TN/(TN+FN)
    # Fall out or false positive rate
    FPR = FP/(FP+TN)
    # False negative rate
    FNR = FN/(TP+FN)
    # False discovery rate
    FDR = FP/(TP+FP)
    
    # Overall accuracy
    ACC = (TP+TN)/(TP+FP+FN+TN)
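    As a concrete illustration, here is the same computation run on a made-up 3×3 confusion matrix (the numbers are arbitrary, purely for demonstration):

    ```python
    import numpy as np

    # Hypothetical 3-class confusion matrix: rows = true labels, columns = predictions
    cm = np.array([[5, 2, 0],
                   [1, 6, 1],
                   [0, 2, 7]])

    FP = cm.sum(axis=0) - np.diag(cm)  # column totals minus the diagonal
    FN = cm.sum(axis=1) - np.diag(cm)  # row totals minus the diagonal
    TP = np.diag(cm)
    TN = cm.sum() - (FP + FN + TP)     # everything outside the class's row and column

    print(TP)  # [5 6 7]
    print(FP)  # [1 4 1]
    print(FN)  # [2 2 2]
    print(TN)  # [16 12 14]
    ```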
    
  • 2020-12-02 05:06

    You can obtain all of the parameters from the confusion matrix. Assuming the first index corresponds to the positive label and the rows correspond to the true labels, the structure of the 2×2 confusion matrix is as follows:

    TP|FN
    FP|TN
    

    So

    TP = cm[0][0]
    FN = cm[0][1]
    FP = cm[1][0]
    TN = cm[1][1]
    

    More details at https://en.wikipedia.org/wiki/Confusion_matrix
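    Note that scikit-learn's own `confusion_matrix` uses the opposite ordering by default (negative label first), so TN ends up at [0][0]. A sketch of the common unpacking idiom for the binary case:

    ```python
    from sklearn.metrics import confusion_matrix

    y_true = [1, 1, 0, 0, 0, 1, 0, 1, 0, 0, 0]
    y_pred = [1, 1, 1, 0, 0, 0, 1, 1, 0, 1, 0]

    # Default label order is [0, 1], so the layout is [[TN, FP], [FN, TP]]
    tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()
    print(tn, fp, fn, tp)  # 4 3 1 3

    # Passing labels=[1, 0] instead reproduces the [[TP, FN], [FP, TN]] layout shown above
    ```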

  • 2020-12-02 05:07

    In scikit-learn version 0.22 and later, you can do it like this:

    from sklearn.metrics import multilabel_confusion_matrix
    
    y_true = ["cat", "ant", "cat", "cat", "ant", "bird"]
    y_pred = ["ant", "ant", "cat", "cat", "ant", "cat"]
    
    mcm = multilabel_confusion_matrix(y_true, y_pred, labels=["ant", "bird", "cat"])
    
    tn = mcm[:, 0, 0]
    tp = mcm[:, 1, 1]
    fn = mcm[:, 1, 0]
    fp = mcm[:, 0, 1]
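    Running this on the example gives one 2×2 matrix per label, each laid out as [[tn, fp], [fn, tp]]; a quick sketch to print the per-class counts:

    ```python
    from sklearn.metrics import multilabel_confusion_matrix

    y_true = ["cat", "ant", "cat", "cat", "ant", "bird"]
    y_pred = ["ant", "ant", "cat", "cat", "ant", "cat"]

    labels = ["ant", "bird", "cat"]
    mcm = multilabel_confusion_matrix(y_true, y_pred, labels=labels)

    # Each mcm[i] is the 2x2 matrix [[tn, fp], [fn, tp]] for labels[i]
    for label, m in zip(labels, mcm):
        tn, fp, fn, tp = m.ravel()
        print(f"{label}: TP={tp} FP={fp} FN={fn} TN={tn}")
    ```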
    
  • 2020-12-02 05:08

    I think the two answers above are not fully correct. For example, suppose that we have the following arrays:

    y_actual = [1, 1, 0, 0, 0, 1, 0, 1, 0, 0, 0]

    y_pred = [1, 1, 1, 0, 0, 0, 1, 1, 0, 1, 0]

    If we compute the FP, FN, TP and TN values manually, they should be as follows:

    FP: 3 FN: 1 TP: 3 TN: 4

    However, if we use the first answer, results are given as follows:

    FP: 1 FN: 3 TP: 3 TN: 4

    They are not correct, because in the first answer, a False Positive should be a case where the actual value is 0 but the prediction is 1, not the opposite. The same applies to False Negatives.

    And, if we use the second answer, the results are computed as follows:

    FP: 3 FN: 1 TP: 4 TN: 3

    The True Positive and True Negative counts are not correct; they are swapped.

    Am I correct with my computations? Please let me know if I am missing something.
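    The manual counts can be double-checked directly from the definitions (a sketch over the same two arrays):

    ```python
    y_actual = [1, 1, 0, 0, 0, 1, 0, 1, 0, 0, 0]
    y_pred   = [1, 1, 1, 0, 0, 0, 1, 1, 0, 1, 0]

    # Apply the definitions pairwise: FP means actual 0 but predicted 1, etc.
    TP = sum(1 for a, p in zip(y_actual, y_pred) if a == 1 and p == 1)
    FP = sum(1 for a, p in zip(y_actual, y_pred) if a == 0 and p == 1)
    FN = sum(1 for a, p in zip(y_actual, y_pred) if a == 1 and p == 0)
    TN = sum(1 for a, p in zip(y_actual, y_pred) if a == 0 and p == 0)

    print(f"FP: {FP} FN: {FN} TP: {TP} TN: {TN}")  # FP: 3 FN: 1 TP: 3 TN: 4
    ```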

  • 2020-12-02 05:09

    You can try sklearn.metrics.classification_report, as below:

    from sklearn.metrics import classification_report

    y_true = [1, 1, 0, 0, 0, 1, 0, 1, 0, 0, 0]
    y_pred = [1, 1, 1, 0, 0, 0, 1, 1, 0, 1, 0]

    print(classification_report(y_true, y_pred))
    

    output:

             precision    recall  f1-score   support
    
          0       0.80      0.57      0.67         7
          1       0.50      0.75      0.60         4
    
          avg / total       0.69      0.64      0.64        11
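    If you need the numbers programmatically rather than as printed text, classification_report also accepts output_dict=True (available since scikit-learn 0.20) and returns a nested dict keyed by class label:

    ```python
    from sklearn.metrics import classification_report

    y_true = [1, 1, 0, 0, 0, 1, 0, 1, 0, 0, 0]
    y_pred = [1, 1, 1, 0, 0, 0, 1, 1, 0, 1, 0]

    report = classification_report(y_true, y_pred, output_dict=True)
    print(report["0"]["precision"])  # 0.8
    print(report["1"]["recall"])     # 0.75
    ```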
    
  • 2020-12-02 05:09

    Just in case someone is looking for the same in a multi-class setting:

    def perf_measure(y_actual, y_pred):
        # sorted() makes the class order deterministic
        class_id = sorted(set(y_actual).union(set(y_pred)))
        TP = []
        FP = []
        TN = []
        FN = []

        for index, _id in enumerate(class_id):
            TP.append(0)
            FP.append(0)
            TN.append(0)
            FN.append(0)
            for i in range(len(y_pred)):
                if y_actual[i] == y_pred[i] == _id:          # true positive
                    TP[index] += 1
                if y_pred[i] == _id and y_actual[i] != _id:  # false positive
                    FP[index] += 1
                if y_actual[i] != _id and y_pred[i] != _id:  # true negative
                    TN[index] += 1
                if y_actual[i] == _id and y_pred[i] != _id:  # false negative
                    FN[index] += 1

        return class_id, TP, FP, TN, FN
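    For reference, here is a condensed sketch of the same one-vs-rest counting, with a usage example on the binary arrays from the earlier answer (results listed for class 0 and class 1, in that order):

    ```python
    def perf_measure(y_actual, y_pred):
        # Per-class TP/FP/TN/FN via one-vs-rest counting
        class_id = sorted(set(y_actual) | set(y_pred))
        TP = [sum(1 for a, p in zip(y_actual, y_pred) if a == p == c) for c in class_id]
        FP = [sum(1 for a, p in zip(y_actual, y_pred) if p == c and a != c) for c in class_id]
        TN = [sum(1 for a, p in zip(y_actual, y_pred) if a != c and p != c) for c in class_id]
        FN = [sum(1 for a, p in zip(y_actual, y_pred) if a == c and p != c) for c in class_id]
        return class_id, TP, FP, TN, FN

    y_actual = [1, 1, 0, 0, 0, 1, 0, 1, 0, 0, 0]
    y_pred   = [1, 1, 1, 0, 0, 0, 1, 1, 0, 1, 0]

    ids, TP, FP, TN, FN = perf_measure(y_actual, y_pred)
    print(ids)             # [0, 1]
    print(TP, FP, TN, FN)  # [4, 3] [1, 3] [3, 4] [3, 1]
    ```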
    