Weka UI and API code in Java gives different results

Submitted by 非 Y 不嫁゛ on 2019-12-11 03:23:02

Question


I am new to Weka.

I am trying to run Weka through its Java API, and I have found that the results from the Weka GUI do not match those produced by my Java code.

I am running the RandomForest algorithm with a separate training set and test set.

Here is the code snippet:

    import weka.classifiers.Evaluation;
    import weka.classifiers.trees.RandomForest;
    import weka.core.Instances;
    import weka.core.converters.ConverterUtils.DataSource;
    import weka.filters.Filter;
    import weka.filters.unsupervised.attribute.Remove;

    // trainingFile, testFile, attrList, out and WekaCustomisation are defined elsewhere.

    // Load the training data; the class attribute is the last one.
    DataSource ds = new DataSource(trainingFile);
    Instances insts = ds.getDataSet();
    insts.setClassIndex(insts.numAttributes() - 1);

    // Configure the classifier.
    RandomForest rf = new RandomForest();
    rf.setNumFeatures(5);
    rf.setSeed(1);
    rf.setNumExecutionSlots(1);

    // Keep only the attributes listed in attrList (inverted Remove filter).
    Remove remove = new Remove();
    int[] attrs = WekaCustomisation.convertIntegers(attrList);
    remove.setAttributeIndicesArray(attrs);
    remove.setInvertSelection(true);
    remove.setInputFormat(insts);
    insts = Filter.useFilter(insts, remove);
    insts.setClassIndex(insts.numAttributes() - 1);

    // Train on all filtered training instances.
    Instances train = new Instances(insts, 0, insts.numInstances());
    rf.buildClassifier(train);

    // Load the test data and apply the same attribute selection.
    DataSource ds2 = new DataSource(testFile);
    Instances instsTest = ds2.getDataSet();
    remove.setInputFormat(instsTest);
    instsTest = Filter.useFilter(instsTest, remove);
    instsTest.setClassIndex(instsTest.numAttributes() - 1);
    Instances testInstances = new Instances(instsTest);

    // Evaluate the trained model on the test set.
    Evaluation eval = new Evaluation(train);
    eval.evaluateModel(rf, testInstances);
    System.out.println(eval.toSummaryString());
    out.write(eval.toSummaryString());
    double roc = eval.areaUnderROC(0);

The confusion matrix produced by the Weka GUI differs from the one produced by this code. What am I missing here?


Answer 1:


First, check that the parameters and filters applied in the Weka GUI are the same as the ones you use in your code (take a look at the log generated by the GUI).
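One way to make that comparison less error-prone is to diff the two option strings mechanically. The sketch below uses only the Java standard library; the class `OptionDiff` and both option strings in `main` are made up for illustration and are not copied from a real Weka log. On the API side, the string should be obtainable with something like `Utils.joinOptions(rf.getOptions())`, while the GUI side would come from the scheme line in its output.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.TreeMap;
import java.util.TreeSet;

class OptionDiff {
    // A token counts as a flag if it starts with '-' and is not a negative number.
    static boolean isFlag(String token) {
        return token.startsWith("-") && !token.matches("-\\d+(\\.\\d+)?");
    }

    // Parses "-I 100 -K 0" into {-I=100, -K=0}; flags without a value map to "".
    static Map<String, String> parse(String options) {
        Map<String, String> parsed = new LinkedHashMap<>();
        String[] tokens = options.trim().split("\\s+");
        for (int i = 0; i < tokens.length; i++) {
            if (!isFlag(tokens[i])) {
                continue;
            }
            boolean hasValue = i + 1 < tokens.length && !isFlag(tokens[i + 1]);
            parsed.put(tokens[i], hasValue ? tokens[++i] : "");
        }
        return parsed;
    }

    // Returns only the flags whose values differ between the two option strings.
    static Map<String, String> diff(String guiOptions, String apiOptions) {
        Map<String, String> gui = parse(guiOptions);
        Map<String, String> api = parse(apiOptions);
        TreeSet<String> flags = new TreeSet<>(gui.keySet());
        flags.addAll(api.keySet());
        Map<String, String> differences = new TreeMap<>();
        for (String flag : flags) {
            String guiValue = gui.getOrDefault(flag, "<absent>");
            String apiValue = api.getOrDefault(flag, "<absent>");
            if (!guiValue.equals(apiValue)) {
                differences.put(flag, "GUI=" + guiValue + ", API=" + apiValue);
            }
        }
        return differences;
    }

    public static void main(String[] args) {
        // Hypothetical option strings, for illustration only.
        String guiOptions = "-I 100 -K 0 -S 1";
        String apiOptions = "-I 100 -K 5 -S 1 -num-slots 1";
        diff(guiOptions, apiOptions)
            .forEach((flag, values) -> System.out.println(flag + ": " + values));
    }
}
```

Any flag this reports is a candidate explanation for the mismatch; in the question's code, for example, `setNumFeatures(5)` only matches the GUI run if the same number of features was configured there.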

A second possibility is the random component in how Random Forest models are built: random features of the dataset are selected for each decision tree. As a result, different models can be generated from the same training dataset during the training phase, and when you evaluate them on the test set you get different results.
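The effect of that random component can be illustrated without Weka. The following is a minimal sketch, assuming the simplified scheme described above (one random feature subset per tree); it is not Weka's actual implementation, and the class `FeatureSubsetDemo` and its parameters are hypothetical. It shows that the subsets are reproduced only when the random generator is seeded identically.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Random;

class FeatureSubsetDemo {
    // For each tree, draws a random subset of k feature indices out of numFeatures.
    static List<List<Integer>> drawSubsets(int numTrees, int numFeatures, int k, long seed) {
        Random rng = new Random(seed);
        List<List<Integer>> subsets = new ArrayList<>();
        for (int t = 0; t < numTrees; t++) {
            List<Integer> indices = new ArrayList<>();
            for (int i = 0; i < numFeatures; i++) {
                indices.add(i);
            }
            // Shuffle with the seeded generator and keep the first k indices.
            Collections.shuffle(indices, rng);
            List<Integer> subset = new ArrayList<>(indices.subList(0, k));
            Collections.sort(subset);
            subsets.add(subset);
        }
        return subsets;
    }

    public static void main(String[] args) {
        List<List<Integer>> first = drawSubsets(3, 10, 5, 1L);
        List<List<Integer>> again = drawSubsets(3, 10, 5, 1L);
        List<List<Integer>> other = drawSubsets(3, 10, 5, 2L);
        System.out.println("seed 1:       " + first);
        System.out.println("seed 1 again: " + again);
        System.out.println("seed 2:       " + other);
        System.out.println("same seed reproduces the subsets: " + first.equals(again));
    }
}
```

For the comparison in the question, this means the GUI run and the API run need the same seed (the code above already calls `setSeed(1)`) as well as the same data and options; otherwise the trees, and with them the confusion matrix, can differ.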



Source: https://stackoverflow.com/questions/11872974/weka-ui-and-api-code-in-java-gives-different-results
