Question
I am trying to construct a confusion matrix without using the sklearn library. I am having trouble correctly forming the confusion matrix. Here's my code:
def comp_confmat():
    currentDataClass = [1,3,3,2,5,5,3,2,1,4,3,2,1,1,2]
    predictedClass = [1,2,3,4,2,3,3,2,1,2,3,1,5,1,1]
    cm = []
    classes = int(max(currentDataClass) - min(currentDataClass)) + 1 #find number of classes
    for c1 in range(1,classes+1):#for every true class
        counts = []
        for c2 in range(1,classes+1):#for every predicted class
            count = 0
            for p in range(len(currentDataClass)):
                if currentDataClass[p] == predictedClass[p]:
                    count += 1
            counts.append(count)
        cm.append(counts)
    print(np.reshape(cm,(classes,classes)))
However this returns:
[[7 7 7 7 7]
[7 7 7 7 7]
[7 7 7 7 7]
[7 7 7 7 7]
[7 7 7 7 7]]
But I don't understand why every iteration results in 7, when I am resetting the count each time and looping through different values.
This is what I should be getting (using sklearn's confusion_matrix function):
[[3 0 0 0 1]
[2 1 0 1 0]
[0 1 3 0 0]
[0 1 0 0 0]
[0 1 1 0 0]]
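Answer 1:
Your innermost loop needs a case distinction that depends on c1 and c2. As written, it counts every sample where the true and predicted classes agree, no matter which cell is being filled, so every cell ends up with the same total (7, the number of correct predictions). The cell for (c1, c2) should instead count only the samples whose true class is c1 and whose predicted class is c2.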
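A minimal sketch of the question's loop with that case distinction applied: a sample is counted for cell (c1, c2) only when its true class is c1 and its predicted class is c2. Two changes here are my own assumptions rather than part of the original code: the lists are passed in as parameters, and the class range runs from the smallest to the largest true label instead of starting at 1.

```python
def comp_confmat(actual, predicted):
    """Count, for each (true, predicted) class pair, how many samples fall in it."""
    # Assumption: labels are consecutive integers between min and max of `actual`.
    lo, hi = min(actual), max(actual)
    cm = []
    for c1 in range(lo, hi + 1):        # for every true class
        counts = []
        for c2 in range(lo, hi + 1):    # for every predicted class
            count = 0
            for p in range(len(actual)):
                # Count a sample only if it belongs to this exact cell.
                if actual[p] == c1 and predicted[p] == c2:
                    count += 1
            counts.append(count)
        cm.append(counts)
    return cm


currentDataClass = [1, 3, 3, 2, 5, 5, 3, 2, 1, 4, 3, 2, 1, 1, 2]
predictedClass = [1, 2, 3, 4, 2, 3, 3, 2, 1, 2, 3, 1, 5, 1, 1]

cm = comp_confmat(currentDataClass, predictedClass)
for row in cm:
    print(row)
```

The diagonal of this matrix sums to 7, which is the value the original loop wrote into every cell.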
Here's another way, using nested list comprehensions:
currentDataClass = [1,3,3,2,5,5,3,2,1,4,3,2,1,1,2]
predictedClass = [1,2,3,4,2,3,3,2,1,2,3,1,5,1,1]
classes = int(max(currentDataClass) - min(currentDataClass)) + 1 #find number of classes
counts = [[sum([(currentDataClass[i] == true_class) and (predictedClass[i] == pred_class)
                for i in range(len(currentDataClass))])
           for pred_class in range(1, classes + 1)]
          for true_class in range(1, classes + 1)]
counts
[[3, 0, 0, 0, 1],
[2, 1, 0, 1, 0],
[0, 1, 3, 0, 0],
[0, 1, 0, 0, 0],
[0, 1, 1, 0, 0]]
Answer 2:
import numpy as np
currentDataClass = [1, 3, 3, 2, 5, 5, 3, 2, 1, 4, 3, 2, 1, 1, 2]
predictedClass = [1, 2, 3, 4, 2, 3, 3, 2, 1, 2, 3, 1, 5, 1, 1]
def comp_confmat(actual, predicted):
    # convert to arrays so the comparisons below are element-wise
    actual = np.asarray(actual)
    predicted = np.asarray(predicted)
    classes = np.unique(actual) # extract the different classes
    matrix = np.zeros((len(classes), len(classes))) # initialize the confusion matrix with zeros
    for i in range(len(classes)):
        for j in range(len(classes)):
            # count samples with true class classes[i] and predicted class classes[j]
            matrix[i, j] = np.sum((actual == classes[i]) & (predicted == classes[j]))
    return matrix
comp_confmat(currentDataClass, predictedClass)
array([[3., 0., 0., 0., 1.],
[2., 1., 0., 1., 0.],
[0., 1., 3., 0., 0.],
[0., 1., 0., 0., 0.],
[0., 1., 1., 0., 0.]])
Answer 3:
Here is my solution using numpy and pandas:
import numpy as np
import pandas as pd
currentDataClass = [1, 3, 3, 2, 5, 5, 3, 2, 1, 4, 3, 2, 1, 1, 2]
predictedClass = [1, 2, 3, 4, 2, 3, 3, 2, 1, 2, 3, 1, 5, 1, 1]
classes = sorted(set(currentDataClass)) # sorted list, so the row/column order is well defined
number_of_classes = len(classes)
conf_matrix = pd.DataFrame(
    np.zeros((number_of_classes, number_of_classes),dtype=int),
    index=classes,
    columns=classes)
for i, j in zip(currentDataClass,predictedClass):
    conf_matrix.loc[i, j] += 1 # row = true class, column = predicted class
print(conf_matrix.values)
[[3 0 0 0 1]
[2 1 0 1 0]
[0 1 3 0 0]
[0 1 0 0 0]
[0 1 1 0 0]]
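The single pass over zip(currentDataClass, predictedClass) used above also works without pandas or numpy. The sketch below is one possible plain-Python variant, not part of the original answer. The function name confusion_matrix_single_pass is hypothetical, and taking the labels from both lists is my own assumption, made so that a class appearing only in the predictions still gets a row and a column.

```python
def confusion_matrix_single_pass(actual, predicted):
    """Build the matrix in one pass; rows are true classes, columns are predicted."""
    # Assumption: use the labels from both lists, sorted for a stable order.
    labels = sorted(set(actual) | set(predicted))
    index = {label: k for k, label in enumerate(labels)}
    # One independent row list per label (avoid [[0] * n] * n, which aliases rows).
    matrix = [[0] * len(labels) for _ in labels]
    for true_label, pred_label in zip(actual, predicted):
        matrix[index[true_label]][index[pred_label]] += 1
    return labels, matrix


currentDataClass = [1, 3, 3, 2, 5, 5, 3, 2, 1, 4, 3, 2, 1, 1, 2]
predictedClass = [1, 2, 3, 4, 2, 3, 3, 2, 1, 2, 3, 1, 5, 1, 1]

labels, matrix = confusion_matrix_single_pass(currentDataClass, predictedClass)
print(labels)
for row in matrix:
    print(row)
```

Because the labels are looked up through a dictionary, this version does not require the classes to be consecutive integers.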
Source: https://stackoverflow.com/questions/61193476/constructing-a-confusion-matrix-from-data-without-sklearn