naivebayes | 易学教程

Naive Bayesian classifier - multiple decisions

Question: I need to know whether the naive Bayesian classifier can be used to generate multiple decisions. I couldn't find any examples that show it supporting multiple decisions. I'm new to this area, so I'm a bit confused. Actually, I need to develop character recognition software, where I need to identify what the given character is. It seems the Bayesian classifier can be used to identify whether a given character is a particular character or not, but it cannot give any other
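
A naive Bayes classifier scores every candidate class and returns the one with the highest posterior, so it is not restricted to yes/no decisions. The following is a minimal sketch of that idea, assuming binary pixel features (a Bernoulli model with Laplace smoothing). The 3x3 glyphs, the function names, and the noisy test sample are all invented for illustration and are not taken from the question.

```python
import math
from collections import Counter, defaultdict


def train(samples):
    """Count classes and 'on' pixels. samples: list of (pixels, label)."""
    class_counts = Counter(label for _, label in samples)
    # on_counts[label][i] = number of samples of `label` with pixel i set
    on_counts = defaultdict(Counter)
    for pixels, label in samples:
        for i, value in enumerate(pixels):
            if value:
                on_counts[label][i] += 1
    return class_counts, on_counts


def log_posteriors(model, pixels):
    """Return an unnormalized log posterior for every known class."""
    class_counts, on_counts = model
    total = sum(class_counts.values())
    scores = {}
    for label, count in class_counts.items():
        score = math.log(count / total)  # log prior P(label)
        for i, value in enumerate(pixels):
            # Laplace-smoothed estimate of P(pixel i is on | label)
            p_on = (on_counts[label][i] + 1) / (count + 2)
            score += math.log(p_on if value else 1.0 - p_on)
        scores[label] = score
    return scores


def classify(model, pixels):
    """Pick the class with the highest score (one decision among many)."""
    scores = log_posteriors(model, pixels)
    return max(scores, key=scores.get)


# Hypothetical 3x3 glyphs, flattened row by row (1 = ink, 0 = blank).
GLYPHS = {
    "I": (0, 1, 0, 0, 1, 0, 0, 1, 0),
    "L": (1, 0, 0, 1, 0, 0, 1, 1, 1),
    "T": (1, 1, 1, 0, 1, 0, 0, 1, 0),
}

model = train([(pixels, label) for label, pixels in GLYPHS.items()])

# A "T" with its bottom pixel missing.
noisy_t = (1, 1, 1, 0, 1, 0, 0, 0, 0)
predicted = classify(model, noisy_t)
print(predicted)  # T
```

The same structure extends to any number of character classes: each new label adds one more entry to the score dictionary, and the decision is still a single argmax.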

Multinomial Naive Bayes Classifier

Question: I have been looking for a multinomial naive Bayes classifier on CRAN, and so far all I can come up with is the binomial implementation in package e1071. Does anyone know of a package that has a multinomial Bayes classifier?
Answer 1: Is bnlearn not doing it for you? http://www.bnlearn.com/ It is on CRAN, claims to implement "naive Bayes" network classifiers, and states that "Discrete (multinomial) data sets are supported".
Source: https://stackoverflow.com/questions/8874058/multinomial-naive-bayes-classifier

Naive-bayes multinomial text classifier using Data frame in Scala Spark

Question: I am trying to build a NaiveBayes classifier, loading the data from a database as a DataFrame which contains (label, text). Here's a sample of the data (multinomial label):
    |label|             feature|
    +-----+--------------------+
    |    1|combusting prepar...|
    |    1|adhesives for ind...|
    |    1|                    |
    |    1| salt for preserving|
    |    1|auxiliary fluids ...|
I have used the following transformations for tokenization, stopword removal, n-grams, and hashTF:
    val selectedData = df.select("label", "feature")
    // Tokenize RDD
    val tokenizer = new

How to get feature Importance in naive bayes?

Question: I have a dataset of reviews with a positive/negative class label, and I am applying Naive Bayes to it. First, I convert the reviews into a bag of words. Here sorted_data['Text'] holds the reviews and final_counts is a sparse matrix:
    count_vect = CountVectorizer()
    final_counts = count_vect.fit_transform(sorted_data['Text'].values)
I am splitting the data into train and test datasets:
    X_1, X_test, y_1, y_test = cross_validation.train_test_split(final_counts, labels, test_size=0.3, random
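
One common heuristic for this kind of question, sketched here under stated assumptions rather than as an answer from the original thread, is to compare the per-class log probabilities that a fitted MultinomialNB exposes in feature_log_prob_. Words with a large positive-minus-negative difference lean positive, and vice versa. The four toy reviews and all variable names below are made up for illustration; the choice of MultinomialNB is also an assumption, since the excerpt does not say which variant was used.

```python
import numpy as np
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB

# Hypothetical toy data standing in for the reviews dataset.
reviews = [
    "great product love it",
    "love the great taste",
    "terrible product hate it",
    "hate the awful taste",
]
labels = ["positive", "positive", "negative", "negative"]

count_vect = CountVectorizer()
counts = count_vect.fit_transform(reviews)
clf = MultinomialNB().fit(counts, labels)

# Map column index -> term using the fitted vocabulary.
vocab = count_vect.vocabulary_
terms = np.array(sorted(vocab, key=vocab.get))

# feature_log_prob_ has one row per class, in the order of clf.classes_.
classes = list(clf.classes_)
pos_row = clf.feature_log_prob_[classes.index("positive")]
neg_row = clf.feature_log_prob_[classes.index("negative")]
log_ratio = pos_row - neg_row  # > 0 leans positive, < 0 leans negative

order = np.argsort(log_ratio)
top_negative = terms[order[:2]].tolist()
top_positive = terms[order[-2:]].tolist()
print("most positive terms:", top_positive)
print("most negative terms:", top_negative)
```

This ranks terms by how strongly they separate the two classes, which is only one possible reading of "importance"; it says nothing about how often a term actually appears at prediction time.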

Semi-supervised Naive Bayes with NLTK [closed]

Question: I have built a semi-supervised version of NLTK's Naive Bayes in Python based on the EM (expectation-maximization) algorithm. However, in some iterations

Any Naive Bayesian Classifier in python? [closed]

Question: I have tried the Orange Framework for Naive Bayesian classification. The methods are extremely unintuitive, and the documentation is extremely unorganized. Does anyone here have another framework to recommend? I use mostly NaiveBayesian for now. I was thinking of using nltk's NaiveClassification but then they

sklearn (Bad Input Shape) ValueError

Question: I am new to the world of ML and sklearn. I tried to use GaussianNB on a dataset with X_train[2500,800], Y_train[2500,8].
    from sklearn.naive_bayes import GaussianNB
    clf = GaussianNB()
    clf.fit(X, Y)
On running the program, it shows ValueError: bad input shape (2500, 8). How do I convert Y_train[2500,8] to Y_train[2500,1]?
Answer 1: The OP is using a one-hot encoder, so the fit function won't work with the array. @Ishant Mrinal recommends this:
    Y_train = np.argmax(Y_train, axis=1)
That will allow
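
To make the effect of np.argmax concrete, here is a small sketch on a made-up one-hot array (4 samples, 3 classes, rather than the 2500 x 8 array in the question). The array contents and variable names are assumptions for illustration only.

```python
import numpy as np

# Hypothetical one-hot labels: each row has a single 1 marking its class.
Y_onehot = np.array([
    [1, 0, 0],
    [0, 0, 1],
    [0, 1, 0],
    [0, 0, 1],
])

# argmax along axis 1 returns, for each row, the column index holding the 1.
y_labels = np.argmax(Y_onehot, axis=1)

print(Y_onehot.shape)  # (4, 3)
print(y_labels.shape)  # (4,)
print(y_labels)        # [0 2 1 2]
```

The result is a 1-D array of class indices with shape (n_samples,), which is the target shape that GaussianNB.fit accepts.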

How to use k-fold cross validation in scikit with naive Bayes classifier and NLTK

Question: I have a small corpus and I want to calculate the accuracy of a naive Bayes classifier using 10-fold cross validation. How can I do it?
Answer 1: Your options are to either set this up yourself or use something like NLTK-Trainer, since NLTK doesn't directly support cross-validation for machine learning algorithms. I'd recommend just using another module to do this for you, but if you really want to write your own code you could do something like the following. Supposing you want 10-fold, you
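
The answer's own code is cut off in this excerpt, so the following is an independent sketch of hand-rolled k-fold cross-validation, not the answerer's implementation. The helper names k_fold_splits and cross_validate, and the train_and_score callback, are hypothetical. It uses contiguous folds without shuffling, so the data should be shuffled beforehand if it is ordered by label.

```python
def k_fold_splits(items, k=10):
    """Yield (train, test) lists for each of k contiguous folds."""
    fold_size, remainder = divmod(len(items), k)
    start = 0
    for fold in range(k):
        # The first `remainder` folds take one extra item so nothing is dropped.
        end = start + fold_size + (1 if fold < remainder else 0)
        test = items[start:end]
        train = items[:start] + items[end:]
        yield train, test
        start = end


def cross_validate(items, train_and_score, k=10):
    """Average the score returned by train_and_score(train, test) over k folds."""
    scores = [train_and_score(train, test)
              for train, test in k_fold_splits(items, k)]
    return sum(scores) / len(scores)


# With NLTK installed and `labeled` a list of (featureset, label) pairs,
# the callback could be written as (not executed here):
#
#   import nltk
#   def nb_accuracy(train, test):
#       classifier = nltk.NaiveBayesClassifier.train(train)
#       return nltk.classify.accuracy(classifier, test)
#
#   mean_accuracy = cross_validate(labeled, nb_accuracy, k=10)
```

Every item appears in exactly one test fold, so the averaged score uses the whole corpus for evaluation while never testing on data the classifier was trained on.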

How can accuracy differ between one_hot_encode and count_vectorizer for the same dataset?

Question: onehot_enc, BernoulliNB: Here, I have used two different files for reviews and labels, and I've used "train_test_split" to randomly split the data into 80% train data and 20% test data.
reviews.txt:
    Colors & clarity is superb
    Sadly the picture is not nearly as clear or bright as my 40 inch Samsung
    The picture is clear and beautiful
    Picture is not clear
labels.txt:
    positive
    negative
    positive
    negative
My code:
    from sklearn.preprocessing import MultiLabelBinarizer
    from sklearn.model_selection

How to find out the accuracy?

Question: I've wondered if there is a function in sklearn which corresponds to the accuracy (difference between actual and predicted data), and how to print it out?
    from sklearn import datasets
    iris = datasets.load_iris()
    from sklearn.naive_bayes import GaussianNB
    naive_classifier = GaussianNB()
    y = naive_classifier.fit(iris.data, iris.target).predict(iris.data)
    pr = naive_classifier.predict(iris.data)
Answer 1: Most classifiers in scikit have an inbuilt score() function, in which you can input your X_test and y
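
As a sketch of the score() approach on the same iris data: for classifiers, score() returns the mean accuracy on the data passed in, which matches accuracy_score applied to the predictions. The held-out split, test_size=0.3, and random_state=0 are assumptions added here so that accuracy is measured on data the model has not seen; they are not part of the original question.

```python
from sklearn import datasets
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB

iris = datasets.load_iris()

# Hold out 30% of the samples for evaluation (split parameters are assumed).
X_train, X_test, y_train, y_test = train_test_split(
    iris.data, iris.target, test_size=0.3, random_state=0
)

naive_classifier = GaussianNB().fit(X_train, y_train)

# score() computes mean accuracy directly from features and true labels.
accuracy = naive_classifier.score(X_test, y_test)

# Equivalent two-step form: predict first, then compare with the true labels.
same_accuracy = accuracy_score(y_test, naive_classifier.predict(X_test))

print(f"accuracy: {accuracy:.3f}")
```

Scoring on iris.data itself, as in the question's code, would report training accuracy, which tends to be optimistic compared with a held-out estimate.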