Question
My prediction results look like this
TestArray
[1,0,0,0,1,0,1,...,1,0,1,1],
[1,0,1,0,0,1,0,...,0,1,1,1],
[0,1,1,1,1,1,0,...,0,1,1,1],
.
.
.
[1,1,0,1,1,0,1,...,0,1,1,1],
PredictionArray
[1,0,0,0,0,1,1,...,1,0,1,1],
[1,0,1,1,1,1,0,...,1,0,0,1],
[0,1,0,1,0,0,0,...,1,1,1,1],
.
.
.
[1,1,0,1,1,0,1,...,0,1,1,1],
this is the size of the arrays that I have
TestArray.shape
Out[159]: (200, 24)
PredictionArray.shape
Out[159]: (200, 24)
I want to get TP, TN, FP and FN for these arrays
I tried this code
cm=confusion_matrix(TestArray.argmax(axis=1), PredictionArray.argmax(axis=1))
TN = cm[0][0]
FN = cm[1][0]
TP = cm[1][1]
FP = cm[0][1]
print(TN,FN,TP,FP)
but the results I got
125 5 0 1
I checked the shape of cm
cm.shape
Out[168]: (17, 17)
125 + 5 + 0 + 1 = 131, which does not equal the number of rows I have, which is 200.
I am expecting 200, as each cell in the array is supposed to be a TP, TN, FP or FN, so the total should be 200.
How can I fix that?
Here is an example of the problem
import numpy as np
from sklearn.metrics import confusion_matrix
TestArray = np.array(
[
[1,0,0,1,0,1,1,0,1,0,1,1,0,0,1,1,1,0,0,1],
[0,1,1,0,1,0,0,1,0,0,0,1,0,1,0,1,1,0,1,1],
[1,0,1,1,1,1,0,0,1,1,1,1,0,0,1,0,0,0,0,0],
[0,1,1,1,0,0,0,0,0,1,0,0,1,0,0,1,0,1,1,1],
[0,0,0,0,1,1,0,1,1,0,0,1,0,1,1,0,1,1,1,1],
[1,0,0,1,1,1,0,1,1,0,1,0,0,1,1,0,0,1,0,0],
[1,1,1,0,0,1,0,0,1,1,0,1,0,1,1,1,1,1,0,1],
[0,0,0,1,0,0,1,0,1,0,1,0,0,0,0,1,0,0,1,1],
[1,0,1,0,0,0,0,1,0,1,0,1,0,0,0,0,1,0,1,0],
[1,1,0,1,1,1,1,0,1,0,1,0,1,1,1,1,0,1,0,0]
])
TestArray.shape
PredictionArray = np.array(
[
[0,0,0,1,1,1,1,0,0,0,1,0,0,0,1,0,1,0,1,1],
[0,1,0,0,1,0,1,1,0,0,0,1,1,0,0,1,1,0,0,1],
[1,1,0,1,1,1,0,0,0,0,0,1,0,0,1,0,0,1,0,0],
[0,1,0,1,0,0,1,0,0,1,0,1,1,0,0,1,0,0,1,1],
[0,0,1,0,0,1,0,1,1,1,0,1,1,1,0,0,1,1,0,1],
[1,0,0,1,0,1,1,1,1,0,0,1,0,1,1,1,0,1,1,0],
[1,1,0,0,1,1,0,0,0,1,0,1,0,0,1,1,0,1,0,1],
[0,0,0,0,0,0,0,1,1,0,1,0,0,1,0,1,1,0,1,1],
[1,0,1,1,0,0,0,1,0,1,0,1,1,1,1,0,0,0,1,0],
[1,1,0,1,1,1,1,1,1,0,1,0,0,0,0,1,1,1,0,0]
])
PredictionArray.shape
cm=confusion_matrix(TestArray.argmax(axis=1), PredictionArray.argmax(axis=1))
TN = cm[0][0]
FN = cm[1][0]
TP = cm[1][1]
FP = cm[0][1]
print(TN,FN,TP,FP)
The output is
5 0 2 0
5 + 0 + 2 + 0 = 7!
There are 20 columns and 10 rows in the array,
but cm gives a total of 7!
Answer 1:
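When using np.argmax, the arrays that you pass to sklearn.metrics.confusion_matrix aren't binary anymore, as np.argmax returns the index of the first occurring maximum value, in this case along axis=1. Each row is therefore reduced to a single column index, and confusion_matrix treats those indices as class labels.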
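To make the effect of argmax concrete, here is a minimal sketch on a made-up toy array (the name y and its values are illustrative assumptions, not the asker's data). It shows that each binary row collapses to one integer, the position of its first 1.

```python
import numpy as np

# Hypothetical toy data: 3 samples, 5 binary outputs each.
y = np.array([
    [1, 0, 0, 1, 0],
    [0, 0, 1, 1, 1],
    [0, 1, 0, 0, 0],
])

# argmax along axis=1 returns, per row, the column index of the first
# maximum value, i.e. the position of the first 1 in a 0/1 row.
row_labels = y.argmax(axis=1)

print(row_labels)        # one integer per row, no longer 0/1 values
print(row_labels.shape)  # (3,) instead of (3, 5)
```

With 24 columns, the resulting labels can take up to 24 distinct values, which would explain a confusion matrix much larger than 2 x 2.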
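You don't get the good ol' true-positives / hits, true-negatives / correct-rejections, etc., when your prediction isn't binary.
For the original (200, 24) arrays you should find that sum(sum(cm)) indeed equals 200, since each of the 200 rows contributes exactly one entry to the matrix.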
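The following sketch illustrates that point on small made-up arrays (y_true, y_pred and their values are assumptions chosen for illustration). The full matrix sums to the number of rows, while the four top-left cells only cover the rows whose argmax labels are 0 or 1.

```python
import numpy as np
from sklearn.metrics import confusion_matrix

# Hypothetical toy data: 5 samples, 4 binary outputs each.
y_true = np.array([[1, 0, 0, 0],
                   [0, 1, 0, 1],
                   [0, 0, 1, 0],
                   [0, 0, 0, 1],
                   [1, 1, 0, 0]])
y_pred = np.array([[1, 0, 0, 0],
                   [0, 0, 1, 1],
                   [0, 0, 1, 0],
                   [1, 0, 0, 0],
                   [0, 1, 1, 0]])

# After argmax, every row is a single class label in 0..3.
cm = confusion_matrix(y_true.argmax(axis=1), y_pred.argmax(axis=1))

# Sum of the four cells that the question reads as TN, FP, FN, TP.
corner = cm[0][0] + cm[0][1] + cm[1][0] + cm[1][1]

print(cm.shape)   # one row/column per distinct label
print(cm.sum())   # equals the number of samples (rows)
print(corner)     # only part of the total
```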
If each element of the arrays represents an individual prediction, i.e. you are trying to get TP/TN/FP/FN for a total of 200 (10 * 20) predictions in the example, each with an outcome of either 0 or 1, then you can obtain TP/TN/FP/FN by flattening the arrays before passing them to confusion_matrix. That is to say, you could reshape TestArray and PredictionArray to (200,), e.g.:
cm = confusion_matrix(TestArray.reshape(-1), PredictionArray.reshape(-1))
TN = cm[0][0]
FN = cm[1][0]
TP = cm[1][1]
FP = cm[0][1]
print(TN, FN, TP, FP, '=', TN + FN + TP + FP)
Which returns
74 28 73 25 = 200
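As a cross-check on the flattened counts, the same four numbers can be computed directly with NumPy boolean masks. This is only a sketch on made-up arrays (y_true and y_pred are illustrative assumptions); it assumes every cell is an independent 0/1 prediction.

```python
import numpy as np

# Hypothetical toy data: 3 samples, 4 binary outputs each (12 cells).
y_true = np.array([[1, 0, 1, 1],
                   [0, 0, 1, 0],
                   [1, 1, 0, 0]])
y_pred = np.array([[1, 1, 0, 1],
                   [0, 0, 1, 1],
                   [0, 1, 0, 0]])

# Flatten to 1-D boolean vectors so each cell counts as one prediction.
t = y_true.astype(bool).ravel()
p = y_pred.astype(bool).ravel()

TP = int(np.sum(t & p))     # truth 1, prediction 1
TN = int(np.sum(~t & ~p))   # truth 0, prediction 0
FP = int(np.sum(~t & p))    # truth 0, prediction 1
FN = int(np.sum(t & ~p))    # truth 1, prediction 0

print(TN, FN, TP, FP, '=', TN + FN + TP + FP)
```

The total always equals the number of cells, which is a quick sanity check that nothing was dropped.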
Source: https://stackoverflow.com/questions/60964473/arrays-tp-tn-fp-and-fn-in-python