How to implement TF-IDF feature weighting with Naive Bayes

Submitted by 谁说我不能喝 on 2020-01-01 17:28:00

Question


I'm trying to implement a naive Bayes classifier for sentiment analysis, and I plan to use the TF-IDF weighting measure. I'm just a little stuck now: NB generally uses word (feature) frequencies to compute the maximum likelihood estimates. So how do I introduce the TF-IDF weighting measure into naive Bayes?


Answer 1:


You can visit the following blog, which shows in detail how to calculate TF-IDF.




Answer 2:


You use the TF-IDF weights as features/predictors in your statistical model. I suggest using either gensim [1] or scikit-learn [2] to compute the weights, which you then pass to your Naive Bayes fitting procedure.

The scikit-learn 'working with text' tutorial [3] might also be of interest.

[1] http://radimrehurek.com/gensim/models/tfidfmodel.html

[2] http://scikit-learn.org/dev/modules/generated/sklearn.feature_extraction.text.TfidfTransformer.html

[3] http://scikit-learn.github.io/scikit-learn-tutorial/working_with_text_data.html
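To make the suggestion above concrete, here is a minimal sketch of the scikit-learn route: vectorize the corpus into TF-IDF weights, then fit `MultinomialNB` on those weights. The corpus and labels are made-up toy data, and treating real-valued TF-IDF weights as "counts" for multinomial NB is a common practical shortcut rather than a strict probabilistic model.

```python
# Sketch: TF-IDF features fed into multinomial Naive Bayes (scikit-learn).
# The corpus, labels, and test sentence below are invented toy data.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.naive_bayes import MultinomialNB

corpus = [
    "I loved this movie, it was great",
    "What a great film, loved it",
    "Terrible plot and awful acting",
    "I hated this boring movie",
]
labels = [1, 1, 0, 0]  # 1 = positive sentiment, 0 = negative

vectorizer = TfidfVectorizer()
X = vectorizer.fit_transform(corpus)   # sparse matrix: rows = docs, cols = TF-IDF weights

clf = MultinomialNB()
clf.fit(X, labels)                     # NB consumes the TF-IDF weights as features

new_doc = vectorizer.transform(["loved this great film"])
print(clf.predict(new_doc))
```

`TfidfVectorizer` combines `CountVectorizer` and `TfidfTransformer` in one step; if you already have raw count features, apply `TfidfTransformer` to them instead and pass the result to the same `fit` call.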



Source: https://stackoverflow.com/questions/6291546/how-to-implement-tf-idf-feature-weighting-with-naive-bayes
