Question
I am working on a sentence category detection problem, where each sentence can belong to multiple categories. For example:
"It has great sushi and even better service."
True Label: [[ 0. 0. 0. 0. 0. 1. 0. 0. 0. 0. 0. 1.]]
Pred Label: [[ 0. 0. 0. 0. 0. 1. 0. 0. 0. 0. 0. 1.]]
Correct Prediction!
Output: ['FOOD#QUALITY' 'SERVICE#GENERAL']
I have implemented a classifier that can predict multiple categories. I have a total of 587 sentences, each of which can belong to multiple categories. I calculated the accuracy score in two ways:
Are all labels of an example predicted correctly?
code:
print("<------------ZERO one ERROR------------>")
print("Total Examples:", truePred + falsePred, "True Pred:", truePred, "False Pred:", falsePred, "Accuracy:", truePred / (truePred + falsePred))
Output:
<------------ZERO one ERROR------------>
Total Examples: 587 True Pred: 353 False Pred: 234 Accuracy: 0.60136286201
How many labels are predicted correctly across all examples?
code:
print("\n<------------Correct and inccorrect predictions------------>")
print("Total Labels:", len(total[0]), "Predicted Labels:", corrPred, "Accuracy:", corrPred / len(total[0]))
Output:
<------------Correct and inccorrect predictions------------>
Total Labels: 743 Predicted Labels: 522 Accuracy: 0.702557200538
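For reference, the two scores above can be reproduced roughly as follows. This is a minimal sketch, assuming the labels are stored as multi-hot NumPy arrays and that the second score divides by the number of ground-truth labels; the toy arrays and the snake_case names are illustrative and not taken from the original code.

```python
import numpy as np

# Hypothetical toy data: 4 sentences, 3 categories (multi-hot rows).
y_true = np.array([[1, 0, 1],
                   [0, 1, 0],
                   [1, 1, 0],
                   [0, 0, 1]])
y_pred = np.array([[1, 0, 1],
                   [0, 1, 1],
                   [1, 0, 0],
                   [0, 0, 1]])

# Way 1: an example counts as correct only if every label matches.
true_pred = int(np.all(y_true == y_pred, axis=1).sum())
false_pred = len(y_true) - true_pred
exact_match_accuracy = true_pred / (true_pred + false_pred)

# Way 2 (assumed definition): ground-truth labels that were also predicted,
# divided by the total number of ground-truth labels.
total_labels = int(y_true.sum())
corr_pred = int(np.logical_and(y_true == 1, y_pred == 1).sum())
label_accuracy = corr_pred / total_labels

print("Exact match:", exact_match_accuracy)
print("Label level:", label_accuracy)
```

Under this reading, the second score ignores labels that were predicted but are not in the ground truth, so it behaves like recall rather than accuracy.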
Problem: All of the accuracy scores above are calculated by comparing the predictions with the ground-truth labels. But I also want to calculate the F1 score (using micro averaging), precision, and recall. I have the ground-truth labels and need to match my predictions against them, but I don't know how to approach this kind of multi-label classification problem. Can I use scikit-learn or any other Python library?
Answer 1:
Have a look at the metrics already available in sklearn and understand them. If a metric you need does not support your multiclass, multilabel targets, you can write your own or map each distinct combination of categories to a single label.
[ 0. 0. 0. 0. 0. 1. 0. 0. 0. 0. 0. 1.] => 0
[ 0. 0. 0. 0. 0. 1. 0. 0. 0. 0. 0. 0.] => 1
...
You have to understand what this solution implies: if an example has 4 classes and 3 of the 4 are predicted correctly, accuracy_score will score it the same as a prediction with 0 of the 4 correct. In both cases the whole example counts as an error.
Here is an example:
>>> from sklearn.metrics import accuracy_score
>>> y_pred = [0, 2, 1, 3]
>>> y_true = [0, 1, 2, 3]
>>> accuracy_score(y_true, y_pred)
0.5
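The mapping idea can be sketched as follows. This is only an illustration under assumptions: the toy arrays, the `combos` dictionary, and the `to_class` helper are hypothetical names, not part of sklearn. It shows that a row with 3 of 4 labels right and a row with 0 of 4 right are both scored as plain errors.

```python
import numpy as np
from sklearn.metrics import accuracy_score

# Hypothetical toy data: each row is a multi-hot label vector.
y_true = np.array([[1, 1, 1, 1],
                   [0, 1, 0, 0],
                   [1, 0, 0, 1]])
y_pred = np.array([[1, 1, 1, 0],   # 3 of 4 labels right
                   [0, 1, 0, 0],   # all labels right
                   [0, 1, 1, 0]])  # 0 of 4 labels right

# Give every distinct label combination its own integer id.
combos = {}

def to_class(row):
    """Return a stable integer id for one multi-hot row (hypothetical helper)."""
    return combos.setdefault(tuple(row.tolist()), len(combos))

true_ids = [to_class(r) for r in y_true]
pred_ids = [to_class(r) for r in y_pred]

# Only the second example matches, so one of three is counted as correct.
mapped_accuracy = accuracy_score(true_ids, pred_ids)
print(mapped_accuracy)

# Passing the multi-hot matrices directly gives the same exact-match score.
print(accuracy_score(y_true, y_pred))
```

If partial credit matters, a label-level metric such as micro-averaged F1 is probably a better fit than this exact-match score.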
Answer 2:
I built a matrix of predicted labels, predictedlabel, and I already had the correct categories to compare my results against in y_test. So I tried the following code:
from sklearn.metrics import classification_report
from sklearn.metrics import f1_score
from sklearn.metrics import roc_auc_score
print("Classification report: \n", classification_report(y_test, predictedlabel))
print("F1 micro averaging:", f1_score(y_test, predictedlabel, average='micro'))
print("ROC: ", roc_auc_score(y_test, predictedlabel))
and I got the following results:
             precision    recall  f1-score   support

          0       0.74      0.93      0.82        57
          1       0.00      0.00      0.00         3
          2       0.57      0.38      0.46        21
          3       0.75      0.75      0.75        12
          4       0.44      0.68      0.54        22
          5       0.81      0.93      0.87       226
          6       0.57      0.54      0.55        48
          7       0.71      0.38      0.50        13
          8       0.70      0.72      0.71       142
          9       0.33      0.33      0.33        33
         10       0.42      0.52      0.47        21
         11       0.80      0.91      0.85       145

avg / total       0.71      0.78      0.74       743
F1 micro averaging: 0.746153846154
ROC: 0.77407943841
So this is how I am calculating my results.
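To see what micro averaging does with multi-label data, here is a minimal sketch that pools true positives, false positives, and false negatives over every label of every example and compares the result with sklearn. The toy arrays are made up for illustration; only the variable names y_test and predictedlabel come from the code above.

```python
import numpy as np
from sklearn.metrics import f1_score, precision_score, recall_score

# Hypothetical toy data: 4 sentences, 3 categories (multi-hot rows).
y_test = np.array([[1, 0, 1],
                   [0, 1, 0],
                   [1, 1, 0],
                   [0, 0, 1]])
predictedlabel = np.array([[1, 0, 1],
                           [0, 1, 1],
                           [1, 0, 0],
                           [0, 0, 1]])

# Micro averaging: count over all (example, label) cells at once.
tp = int(np.sum((y_test == 1) & (predictedlabel == 1)))
fp = int(np.sum((y_test == 0) & (predictedlabel == 1)))
fn = int(np.sum((y_test == 1) & (predictedlabel == 0)))

precision = tp / (tp + fp)
recall = tp / (tp + fn)
f1 = 2 * precision * recall / (precision + recall)

print("Manual :", precision, recall, f1)
print("sklearn:",
      precision_score(y_test, predictedlabel, average='micro'),
      recall_score(y_test, predictedlabel, average='micro'),
      f1_score(y_test, predictedlabel, average='micro'))
```

Because the counts are pooled, frequent categories dominate the micro average; average='macro' would instead weight every category equally.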
Source: https://stackoverflow.com/questions/36620578/how-to-calculate-f1-measure-in-multi-label-classification