Ntlk & Python, plotting ROC curve

问题

I am using nltk with Python and I would like to plot the ROC curve of my classifier (Naive Bayes). Is there any function for plotting it or should I have to track the True Positive rate and False Positive rate ?

It would be great if someone would point me to some code already doing it...

Thanks.

回答1:

PyROC looks simple enough: tutorial, source code

This is how it would work with the NLTK naive bayes classifier:

# class labels are 0 and 1
labeled_data = [
    (1, featureset_1),
    (0, featureset_2),
    (1, featureset_3),
    # ...
]

# naive_bayes is your already trained classifier,
# preferrably not on the data you're testing on :)

from pyroc import ROCData

roc_data = ROCData(
    (label, naive_bayes.prob_classify(featureset).prob(1))
    for label, featureset
    in labeled_data
)
roc_data.plot()

Edits:

ROC is for binary classifiers only. If you have three classes, you can measure the performance of your positive and negative class separately (by counting the other two classes as 0, like you proposed).
The library expects the output of a decision function as the second value of each tuple. It then tries all possible thresholds, e.g. f(x) >= 0.8 => classify as 1, and plots a point for each threshold (that's why you get a curve in the end). So if your classifier guesses class 0, you actually want a value closer to zero. That's why I proposed .prob(1)

来源：https://stackoverflow.com/questions/8192455/ntlk-python-plotting-roc-curve

标签

python

nlp

machine-learning

nltk

易学教程内所有资源均来自网络或用户发布的内容，如有违反法律规定的内容欢迎反馈！
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!