Question
I am using NLTK with Python and I would like to plot the ROC curve of my classifier (Naive Bayes). Is there a function for plotting it, or do I have to track the true positive rate and false positive rate myself?
It would be great if someone could point me to some code that already does this...
Thanks.
Answer 1:
PyROC looks simple enough: tutorial, source code
This is how it would work with the NLTK naive bayes classifier:
# class labels are 0 and 1
labeled_data = [
    (1, featureset_1),
    (0, featureset_2),
    (1, featureset_3),
    # ...
]
# naive_bayes is your already trained classifier,
# preferably not on the data you're testing on :)
from pyroc import ROCData
roc_data = ROCData(
    (label, naive_bayes.prob_classify(featureset).prob(1))
    for label, featureset
    in labeled_data
)
roc_data.plot()
Edits:
- ROC is for binary classifiers only. If you have three classes, you can measure the performance of your positive and negative class separately (by counting the other two classes as 0, like you proposed).
- The library expects the output of a decision function as the second value of each tuple. It then tries all possible thresholds, e.g. f(x) >= 0.8 => classify as 1, and plots a point for each threshold (that's why you get a curve in the end). So if your classifier guesses class 0, you actually want a value closer to zero. That's why I proposed .prob(1).
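If you would rather track the true positive rate and false positive rate yourself, the threshold sweep described above can be sketched in plain Python 3. This is only an illustration of the idea, not PyROC's actual implementation. The function name roc_points and the example scores are made up for this sketch; the input is assumed to be the same (label, score) tuples as above, with score being the probability of class 1.

```python
def roc_points(scored):
    """Return a list of (fpr, tpr) pairs, one per distinct threshold.

    `scored` is an iterable of (label, score) tuples, where label is 0 or 1
    and score is the classifier's estimate of P(class == 1).
    (Hypothetical helper, not part of NLTK or PyROC.)
    """
    scored = list(scored)
    positives = sum(1 for label, _ in scored if label == 1)
    negatives = len(scored) - positives
    if positives == 0 or negatives == 0:
        raise ValueError("need at least one example of each class")

    # Threshold above every score: nothing is classified as 1.
    points = [(0.0, 0.0)]
    # Lower the threshold step by step; score >= threshold => classify as 1.
    for threshold in sorted({score for _, score in scored}, reverse=True):
        tp = sum(1 for label, score in scored
                 if score >= threshold and label == 1)
        fp = sum(1 for label, score in scored
                 if score >= threshold and label == 0)
        points.append((fp / negatives, tp / positives))
    return points


# Made-up scores, purely for illustration.
example = [(1, 0.9), (0, 0.8), (1, 0.7), (0, 0.3)]
points = roc_points(example)
```

With a trained classifier you would pass in (label, naive_bayes.prob_classify(featureset).prob(1)) for each item of labeled_data, then hand the resulting x/y pairs to whatever plotting library you prefer.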
Source: https://stackoverflow.com/questions/8192455/ntlk-python-plotting-roc-curve