arrays TP, TN, FP and FN in Python

问题

My prediction results look like this

TestArray

[1,0,0,0,1,0,1,...,1,0,1,1],
[1,0,1,0,0,1,0,...,0,1,1,1],
[0,1,1,1,1,1,0,...,0,1,1,1],
.
.
.
[1,1,0,1,1,0,1,...,0,1,1,1],

PredictionArray

[1,0,0,0,0,1,1,...,1,0,1,1],
[1,0,1,1,1,1,0,...,1,0,0,1],
[0,1,0,1,0,0,0,...,1,1,1,1],
.
.
.
[1,1,0,1,1,0,1,...,0,1,1,1],

this is the size of the arrays that I have

TestArray.shape

Out[159]: (200, 24)

PredictionArray.shape

Out[159]: (200, 24)

I want to get TP, TN, FP and FN for these arrays

I tried this code

cm=confusion_matrix(TestArray.argmax(axis=1), PredictionArray.argmax(axis=1))
TN = cm[0][0]
FN = cm[1][0]
TP = cm[1][1]
FP = cm[0][1]
print(TN,FN,TP,FP)

but the results I got

TN = cm[0][0]
FN = cm[1][0]
TP = cm[1][1]
FP = cm[0][1]
print(TN,FN,TP,FP)

125 5 0 1

I checked the shape of cm

cm.shape

Out[168]: (17, 17)

125 + 5 + 0 + 1 = 131 and that does not equal the number of columns I have which is 200

I am expecting to have 200 as each cell in the array suppose to be TF, TN, FP, TP so the total should be 200

How to fix that?

Here is an example of the problem

import numpy as np
from sklearn.metrics import confusion_matrix


TestArray = np.array(
[
[1,0,0,1,0,1,1,0,1,0,1,1,0,0,1,1,1,0,0,1],
[0,1,1,0,1,0,0,1,0,0,0,1,0,1,0,1,1,0,1,1],
[1,0,1,1,1,1,0,0,1,1,1,1,0,0,1,0,0,0,0,0],
[0,1,1,1,0,0,0,0,0,1,0,0,1,0,0,1,0,1,1,1],
[0,0,0,0,1,1,0,1,1,0,0,1,0,1,1,0,1,1,1,1],
[1,0,0,1,1,1,0,1,1,0,1,0,0,1,1,0,0,1,0,0],
[1,1,1,0,0,1,0,0,1,1,0,1,0,1,1,1,1,1,0,1],
[0,0,0,1,0,0,1,0,1,0,1,0,0,0,0,1,0,0,1,1],
[1,0,1,0,0,0,0,1,0,1,0,1,0,0,0,0,1,0,1,0],
[1,1,0,1,1,1,1,0,1,0,1,0,1,1,1,1,0,1,0,0]
])

TestArray.shape



PredictionArray = np.array(
[
[0,0,0,1,1,1,1,0,0,0,1,0,0,0,1,0,1,0,1,1],
[0,1,0,0,1,0,1,1,0,0,0,1,1,0,0,1,1,0,0,1],
[1,1,0,1,1,1,0,0,0,0,0,1,0,0,1,0,0,1,0,0],
[0,1,0,1,0,0,1,0,0,1,0,1,1,0,0,1,0,0,1,1],
[0,0,1,0,0,1,0,1,1,1,0,1,1,1,0,0,1,1,0,1],
[1,0,0,1,0,1,1,1,1,0,0,1,0,1,1,1,0,1,1,0],
[1,1,0,0,1,1,0,0,0,1,0,1,0,0,1,1,0,1,0,1],
[0,0,0,0,0,0,0,1,1,0,1,0,0,1,0,1,1,0,1,1],
[1,0,1,1,0,0,0,1,0,1,0,1,1,1,1,0,0,0,1,0],
[1,1,0,1,1,1,1,1,1,0,1,0,0,0,0,1,1,1,0,0]
])

PredictionArray.shape

cm=confusion_matrix(TestArray.argmax(axis=1), PredictionArray.argmax(axis=1))
TN = cm[0][0]
FN = cm[1][0]
TP = cm[1][1]
FP = cm[0][1]

print(TN,FN,TP,FP)

The output is

5 0 2 0

= 5+0+2+0 = 7 !!

There are 20 columns in the array and 10 rows

but cm gives to total of 7!!

回答1:

When using np.argmax the matrices that you input sklearn.metrics.confusion_matrix isn't binary anymore, as np.argmax returns the index of the first occuring maximum value. In this case along axis=1.

You don't get the good'ol true-positives / hits, true-negatives / correct-rejections, etc., when your prediction isn't binary.

You should find that sum(sum(cm)) indeed equals 200.

If each index of the arrays represents an individual prediction, i.e. you are trying to get TP/TN/FP/FN for a total of 200 (10 * 20) predictions with the outcome of either 0 or 1 for each prediction, then you can obtain TP/TN/FP/FN by flattening the arrays before parsing them to confusion_matrix. That is to say, you could reshape TestArray and PreditionArry to (200,), e.g.:

cm = confusion_matrix(TestArray.reshape(-1), PredictionArray.reshape(-1))

TN = cm[0][0]
FN = cm[1][0]
TP = cm[1][1]
FP = cm[0][1]

print(TN, FN, TP, FP, '=', TN + FN + TP + FP)

Which returns

74 28 73 25 = 200

来源：https://stackoverflow.com/questions/60964473/arrays-tp-tn-fp-and-fn-in-python

标签

python

arrays

confusion-matrix