Remove Missing Values in Weka

Posted by 柔情痞子 on 2019-12-11 00:45:56

Question


I'm using a dataset in Weka for classification that includes missing values. As far as I understand, Weka replaces them automatically with the mode or mean of the training data (using the filter unsupervised/attribute/ReplaceMissingValues) when using a classifier like NaiveBayes.

I would like to try removing them instead, to see how this affects the quality of the classifier. Is there a filter to do that?


Answer 1:


My approach is not perfect: with more than 5 or 6 attributes it becomes quite cumbersome to apply. But if only a few attributes have missing values, I suggest using MultiFilter for this purpose.

For example, if 2 attributes have missing values, you'll use RemoveWithValues 2 times in a MultiFilter, once per attribute.

  1. Load your data in the Weka Explorer.
  2. Select MultiFilter from the Filter area.
  3. Click on MultiFilter and add one RemoveWithValues filter for each attribute that has missing values.
  4. Configure each RemoveWithValues filter with the index of its attribute and set matchMissingValues to True.
  5. Save the filter settings and click Apply in the Explorer.



Answer 2:


Call the removeIf() method on weka.core.Instances and pass it a method reference to hasMissingValue from weka.core.Instance. That method returns true if the given Instance has any missing values, so every such instance is removed from the dataset.

Instances dataset = source.getDataSet(); // for some source
dataset.removeIf(Instance::hasMissingValue);
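
To see what this kind of row removal does without needing Weka on the classpath, here is a minimal plain-Java sketch. It is only an illustration, not Weka code: the class MissingRowsSketch, both of its methods, and the sample rows are hypothetical, and a missing cell is modelled as Double.NaN. The pattern is the same as above: a standard removeIf() call with a predicate that flags incomplete rows.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class MissingRowsSketch {

    // Hypothetical stand-in for Instance.hasMissingValue(): a row counts as
    // incomplete if any of its cells is NaN (our marker for "missing").
    static boolean hasMissingValue(double[] row) {
        for (double cell : row) {
            if (Double.isNaN(cell)) {
                return true;
            }
        }
        return false;
    }

    // Returns a new list holding only the complete rows; the input list is not modified.
    static List<double[]> dropRowsWithMissing(List<double[]> rows) {
        List<double[]> kept = new ArrayList<>(rows);
        kept.removeIf(MissingRowsSketch::hasMissingValue);
        return kept;
    }

    public static void main(String[] args) {
        // Made-up sample data: two of the four rows have a missing cell.
        List<double[]> rows = Arrays.asList(
            new double[] {5.1, 3.5},
            new double[] {Double.NaN, 3.0},
            new double[] {4.7, Double.NaN},
            new double[] {6.2, 2.9});

        List<double[]> complete = dropRowsWithMissing(rows);
        // prints "4 rows before, 2 after"
        System.out.println(rows.size() + " rows before, " + complete.size() + " after");
    }
}
```

Note that a row is dropped as soon as a single attribute is missing, so on data with many gaps this can shrink the training set considerably. It is worth comparing the instance counts before and after.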


Source: https://stackoverflow.com/questions/18230939/remove-missing-values-in-weka
