Is there a built-in way for getting accuracy scores for each class separatetly? I know in sklearn we can get overall accuracy by using metric.accuracy_score
. Is
You can code it by yourself : the accuracy is nothing more than the ratio between the well classified samples (true positives and true negatives) and the total number of samples you have.
Then, for a given class, instead of considering all the samples, you only take into account those of your class.
You can then try this: Let's first define a handy function.
def indices(l, val):
retval = []
last = 0
while val in l[last:]:
i = l[last:].index(val)
retval.append(last + i)
last += i + 1
return retval
The function above will return the indices in the list l of a certain value val
def class_accuracy(y_pred, y_true, class):
index = indices(l, class)
y_pred, y_true = ypred[index], y_true[index]
tp = [1 for k in range(len(y_pred)) if y_true[k]==y_pred[k]]
tp = np.sum(tp)
return tp/float(len(y_pred))
The last function will return the in-class accuracy that you look for.
You can use sklearn's confusion matrix to get the accuracy
from sklearn.metrics import confusion_matrix
import numpy as np
y_true = [0, 1, 2, 2, 2]
y_pred = [0, 0, 2, 2, 1]
target_names = ['class 0', 'class 1', 'class 2']
#Get the confusion matrix
cm = confusion_matrix(y_true, y_pred)
#array([[1, 0, 0],
# [1, 0, 0],
# [0, 1, 2]])
#Now the normalize the diagonal entries
cm = cm.astype('float') / cm.sum(axis=1)[:, np.newaxis]
#array([[1. , 0. , 0. ],
# [1. , 0. , 0. ],
# [0. , 0.33333333, 0.66666667]])
#The diagonal entries are the accuracies of each class
cm.diagonal()
#array([1. , 0. , 0.66666667])
References
from sklearn.metrics import confusion_matrix
y_true = [2, 0, 2, 2, 0, 1]
y_pred = [0, 0, 2, 2, 0, 2]
matrix = confusion_matrix(y_true, y_pred)
matrix.diagonal()/matrix.sum(axis=1)
Your question makes no sense. Accuracy is a global measure, and there is no such thing as class-wise accuracy. The suggestions to normalize by true cases (rows) yields something called true-positive rate, sensitivity or recall, depending on the context. Likewise, if you normalize by prediction (columns), it's called precision or positive predictive value.
The question is misleading. Accuracy scores for each class equal the overall accuracy score. Consider the confusion matrix:
from sklearn.metrics import confusion_matrix
import numpy as np
y_true = [0, 1, 2, 2, 2]
y_pred = [0, 0, 2, 2, 1]
#Get the confusion matrix
cm = confusion_matrix(y_true, y_pred)
print(cm)
This gives you:
[[1 0 0]
[1 0 0]
[0 1 2]]
Accuracy is calculated as the proportion of correctly classified samples to all samples:
accuracy = (TP + TN) / (P + N)
Regarding the confusion matrix, the numerator (TP + TN) is the sum of the diagonal. The denominator is the sum of all cells. Both are the same for every class.
In my opinion, accuracy is generic term that has different dimensions, e.g. precision, recall, f1-score, (or even specificity, sensitivity), etc. that provide accuracy measures in different perspectives. Hence, the function 'classification_report' outputs a range of accuracy measures for each class. For instance, precision provides the proportion of accurately retrieved instances (i.e. true positives) with total number of instances (both true positives and false negatives) available in a particular class.