How to edit Weka configurations to find “1”

大城市里の小女人 submitted on 2020-01-06 08:31:31

Question


I have an ARFF table with boolean results.

Most of the lines end with "0" (about 95%), but the "0" lines don't interest me. I want Weka to find the lines that end with "1".

Unfortunately, most of the algorithms just predict "0" all of the time, which doesn't help me at all.

How can I make Weka find only the "1" lines, if that is possible?


Answer 1:


I think you are describing the classic class imbalance problem. Almost every machine learning algorithm is designed to maximize accuracy. In your case, a model that assigns 0 every time reaches 95% accuracy, and that is often the best it can do on that measure. (For more information, search for "unbalanced classes" or "class imbalance".) In cases like this, however, the minority class is the one of greater interest.
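To make the point concrete, here is a minimal sketch in plain Python (not Weka code). The toy labels are hypothetical and only mirror the roughly 95% figure from the question; the helper names accuracy and recall are mine.

```python
# Hypothetical toy labels: 95 rows of class 0 and 5 rows of class 1.
y_true = [0] * 95 + [1] * 5

# A "classifier" that always predicts the majority class.
y_pred = [0] * len(y_true)


def accuracy(y_true, y_pred):
    """Fraction of rows where the prediction matches the label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def recall(y_true, y_pred, positive=1):
    """Fraction of actual positives that were predicted as positive."""
    true_positives = sum(
        1 for t, p in zip(y_true, y_pred) if t == positive and p == positive
    )
    actual_positives = sum(1 for t in y_true if t == positive)
    return true_positives / actual_positives if actual_positives else 0.0


majority_accuracy = accuracy(y_true, y_pred)
minority_recall = recall(y_true, y_pred)
print(majority_accuracy, minority_recall)  # prints: 0.95 0.0
```

The model looks good on accuracy while finding none of the "1" rows, which is exactly the behaviour described in the question.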

A few quick solutions: upsample class 1, downsample class 0, or combine both to get a balanced dataset for training. You can use WEKA's SpreadSubsample filter for that. You can also have a look at the SMOTE filter and the MetaCost classifier.
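As an illustration of what downsampling the majority class means, here is a plain-Python sketch of random undersampling to a 1:1 class ratio. It shows the idea only and is not Weka's SpreadSubsample implementation; the function name undersample_majority, the row layout (label in the last position), and the toy rows are all assumptions.

```python
import random


def undersample_majority(rows, label_index=-1, seed=0):
    """Randomly drop rows from larger classes until every class has the same size."""
    rng = random.Random(seed)

    # Group rows by their class label.
    by_class = {}
    for row in rows:
        by_class.setdefault(row[label_index], []).append(row)

    # Every class is cut down to the size of the smallest class.
    target = min(len(group) for group in by_class.values())

    balanced = []
    for group in by_class.values():
        balanced.extend(rng.sample(group, target))
    rng.shuffle(balanced)
    return balanced


# Hypothetical rows of the form (feature, label): 95 of class 0, 5 of class 1.
rows = [(i, 0) for i in range(95)] + [(i, 1) for i in range(95, 100)]
balanced = undersample_majority(rows)
counts = {label: sum(1 for r in balanced if r[-1] == label) for label in (0, 1)}
print(counts)  # prints: {0: 5, 1: 5}
```

Undersampling throws away most of the class 0 rows, which is why it is often combined with upsampling or synthetic examples (as SMOTE does) when the minority class is very small.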

If for some reason you are interested in accuracy, you have to test the classifier on the original distribution, so use SpreadSubsample inside a FilteredClassifier; that way the resampling is applied to the training data only. However, as you may have already noticed, if you are interested in the minority class, accuracy is not a very reliable indicator of model performance. Look at class recall, the ROC curve, and AUC instead. A great article about ROC analysis is here: http://www.hpl.hp.com/techreports/2003/HPL-2003-4.pdf
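For intuition about AUC, here is a small plain-Python sketch that computes it as the probability that a randomly chosen class 1 row gets a higher score than a randomly chosen class 0 row, counting ties as half. This is the rank-based view of AUC, not Weka's own code; the function name roc_auc and the scores are made up for illustration.

```python
def roc_auc(y_true, scores, positive=1):
    """AUC as the fraction of (positive, negative) pairs ranked correctly; ties count 0.5."""
    pos = [s for t, s in zip(y_true, scores) if t == positive]
    neg = [s for t, s in zip(y_true, scores) if t != positive]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative example")

    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical labels: 8 rows of class 0 followed by 2 rows of class 1.
y_true = [0, 0, 0, 0, 0, 0, 0, 0, 1, 1]

# A model that gives every row the same score cannot rank anything.
constant_scores = [0.0] * len(y_true)

# A model whose scores are mostly higher for class 1.
useful_scores = [0.1, 0.2, 0.1, 0.3, 0.2, 0.1, 0.4, 0.8, 0.7, 0.9]

auc_constant = roc_auc(y_true, constant_scores)
auc_useful = roc_auc(y_true, useful_scores)
print(auc_constant, auc_useful)  # prints: 0.5 0.9375
```

Unlike accuracy, this measure gives no credit for always predicting 0: a model that cannot separate the two classes stays at 0.5 however unbalanced the data is.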

Good luck



Source: https://stackoverflow.com/questions/22999500/how-to-edit-weka-configurations-to-find-1
