Python: Loaded NLTK Classifier not working

我的未来我决定 提交于 2019-12-10 07:36:02

问题


I'm trying to train a NLTK classifier for sentiment analysis and then save the classifier using pickle. The freshly trained classifier works fine. However, if I load a saved classifier the classifier will either output 'positive', or 'negative' for ALL examples.

I'm saving the classifier using

classifier = nltk.NaiveBayesClassifier.train(training_set)
classifier.classify(words_in_tweet)
f = open('classifier.pickle', 'wb')
pickle.dump(classifier, f)
f.close()

and loading the classifier using

f = open('classifier.pickle', 'rb')
classifier = pickle.load(f)
f.close()
classifier.classify(words_in_tweet)

I'm not getting any errors. Any idea what the problem could be, or how to debug this correctly?


回答1:


The most likely place a pickled classifier can go wrong is with the feature extraction function. This must be used to generate the feature vectors that the classifier works with.

The NaiveBayesClassifier expects feature vectors for both training and classification; your code looks as if you passed the raw words to the classifier instead (but presumably only after unpickling, otherwise you wouldn't get different behavior before and after unpickling). You should store the feature extraction code in a separate file, and import it in both the training and the classifying (or testing) script.

I doubt this applies to the OP, but some NLTK classifiers take the feature extraction function as an argument to the constructor. When you have separate scripts for training and classifying, it can be tricky to ensure that the unpickled classifier successfully finds the same function. This is because of the way pickle works: pickling only saves data, not code. To get it to work, just put the extraction function in a separate file (module) that your scripts import. If you put in in the "main" script, pickle.load will look for it in the wrong place.



来源:https://stackoverflow.com/questions/36727005/python-loaded-nltk-classifier-not-working

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!