Naive Bayes: the within-class variance in each feature of TRAINING must be positive

妖精的绣舞 提交于 2019-12-09 17:44:32

问题


When trying to fit Naive Bayes:

    training_data = sample; % 
    target_class = K8;
 # train model
 nb = NaiveBayes.fit(training_data, target_class);

 # prediction
 y = nb.predict(cluster3);

I get an error:

??? Error using ==> NaiveBayes.fit>gaussianFit at 535
The within-class variance in each feature of TRAINING
must be positive. The within-class variance in feature
2 5 6 in class normal. are not positive.

Error in ==> NaiveBayes.fit at 498
            obj = gaussianFit(obj, training, gindex);

Can anyone shed light on this and how to solve it? Note that I have read a similar post here but I am not sure what to do? It seems as if its trying to fit based on columns rather than rows, the class variance should be based on the probability of each row belonging to a specific class. If I delete those columns then it works but obviously this isnt what I want to do.


回答1:


Assuming that there is no bug anywhere in your code (or NaiveBayes code from mathworks), and again assuming that your training_data is in the form of NxD where there are N observations and D features, then columns 2, 5, and 6 are completely zero for at least a single class. This can happen if you have relatively small training data and high number of classes, in which a single class may be represented by a few observations. Since NaiveBayes by default treats all features as part of a normal distribution, it cannot work with a column that has zero variance for all features related to a single class. In other words, there is no way for NaiveBayes to find the parameters of the probability distribution by fitting a normal distribution to the features of that specific class (note: the default for distribution is normal).

Take a look at the nature of your features. If they seem to not follow a normal distribution within each class, then normal is not the option you want to use. Maybe your data is closer to a multinomial model mn:

nb = NaiveBayes.fit(training_data, target_class, 'Distribution', 'mn');


来源:https://stackoverflow.com/questions/13427664/naive-bayes-the-within-class-variance-in-each-feature-of-training-must-be-posit

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!