Same Instances header (ARFF) for all my database queries

长情又很酷 2021-01-28 14:07

I am using InstanceQuery with SQL queries to construct my Instances. However, the query results do not always come back in the same order, which is normal for SQL. Because of this Instance

2 Answers
  • 2021-01-28 14:33

    I solved a similar problem with the Add filter, which adds attributes to an Instances object. You need to add the correct Attribute, with the proper list of values, to both datasets (in my case, to the test dataset only):

    Load train and test data:

    /* "train" contains labels and data */
    /* "test" contains data only */
    CSVLoader csvLoader = new CSVLoader();
    csvLoader.setFile(new File(trainFile));
    Instances training = csvLoader.getDataSet();
    csvLoader.reset();
    csvLoader.setFile(new File(predictFile));
    Instances test = csvLoader.getDataSet();
    

    Add the new attribute with the Add filter:

    Add add = new Add();
    /* the name of the attribute must be the same as in "train" */
    add.setAttributeName(training.attribute(0).name());
    /* getValues (a helper not shown here) returns a String with the
       comma-separated values of the attribute */
    add.setNominalLabels(getValues(training.attribute(0)));
    /* put the new attribute in the 1st position, the same as in "train" */
    add.setAttributeIndex("1");
    add.setInputFormat(test);
    /* result: a dataset whose header is compatible with "train" */
    test = Filter.useFilter(test, add);
    

    As a result, the headers of "train" and "test" are the same, so the two datasets are compatible for Weka's learning algorithms.
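
    The `getValues` helper is not shown in the answer. Below is a minimal sketch of what it might do, assuming it only has to join the attribute's labels with commas in their original order. To keep the sketch runnable without Weka on the classpath, the hypothetical `NominalLabels.join` takes the labels as a plain list; with Weka, the list could be filled from `Attribute.value(i)` for each `i` below `Attribute.numValues()`.

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical stand-in for the unshown getValues helper. */
final class NominalLabels {

    private NominalLabels() {}

    /**
     * Joins nominal labels into a comma-separated string,
     * preserving the order in which they are given.
     */
    static String join(List<String> labels) {
        if (labels.isEmpty()) {
            throw new IllegalArgumentException("at least one label is required");
        }
        for (String label : labels) {
            // A comma inside a label would be read as a separator.
            if (label.contains(",")) {
                throw new IllegalArgumentException("label contains a comma: " + label);
            }
        }
        return String.join(",", labels);
    }

    public static void main(String[] args) {
        // prints "yes,no,maybe"
        System.out.println(join(Arrays.asList("yes", "no", "maybe")));
    }
}
```

    The label order matters: it should match the order declared in the training header, otherwise the two headers would still differ.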

  • 2021-01-28 14:42

    I tried various approaches to my problem, but it seems that Weka's internal API does not currently offer a solution. I modified the command-line append code of weka.core.Instances for my purposes. This code is also given in this answer

    Based on that, here is my solution. I created a SampleWithKnownHeader.arff file, which contains the correct header values. I read this file with the following code.

    import java.io.BufferedReader;
    import java.io.FileReader;
    import weka.core.Instances;

    public static Instances getSampleInstances() {
        // try-with-resources closes the reader even if parsing fails
        try (BufferedReader reader = new BufferedReader(new FileReader(
                "datas\\SampleWithKnownHeader.arff"))) {
            Instances data = new Instances(reader);
            // the class attribute is the last attribute
            data.setClassIndex(data.numAttributes() - 1);
            return data;
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
    }
    

    After that, I use the following code to create the instances. I had to use a StringBuilder and the string form of each instance, and then save the resulting string to a file.

    import weka.core.Instances;
    import weka.core.converters.ConverterUtils.DataSource;

    public static void main(String[] args) {

        // dataset whose header has the known, fixed attribute definitions
        Instances sampleInstances = MyUtilsForWeka.getSampleInstances();
        DataSource source1 = new DataSource(sampleInstances);

        Instances data2 = InstancesFromDatabase
                .getInstanceDataFromDatabase(DatabaseQueries.WEKALIST_QUESTION1);
        MyUtilsForWeka.saveInstancesToFile(data2, "fromDatabase.arff");
        DataSource source2 = new DataSource(data2);

        StringBuilder sb = new StringBuilder();
        try {
            // write the known header first (structure only, no rows)
            Instances structure1 = source1.getStructure();
            sb.append(structure1);
            // then append each database row in its ARFF text form
            Instances structure2 = source2.getStructure();
            while (source2.hasMoreElements(structure2)) {
                sb.append(source2.nextElement(structure2).toString());
                sb.append("\n");
            }
        } catch (Exception ex) {
            throw new RuntimeException(ex);
        }

        MyUtilsForWeka.saveInstancesToFile(sb.toString(), "combined.arff");
    }
    

    My code for saving the instances to a file is below.

    import java.io.BufferedWriter;
    import java.io.FileWriter;

    public static void saveInstancesToFile(String contents, String filename) {
        // try-with-resources closes the writer even if the write fails
        try (BufferedWriter out = new BufferedWriter(new FileWriter(filename))) {
            out.write(contents);
        } catch (Exception ex) {
            throw new RuntimeException(ex);
        }
    }
    

    This solves my problem, but I wonder whether a more elegant solution exists.
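
    The combining step in this answer (a known header followed by the data rows as text) can be sketched without Weka as a plain string operation. This is only an illustration under assumptions: the class `ArffCombiner` and its `combine` method are hypothetical names, the header is assumed to end with the `@data` line, and the rows are assumed to already be in the column order that the header declares. The sketch does not reorder or validate values.

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical helper: prepends a known ARFF header to data rows. */
final class ArffCombiner {

    private ArffCombiner() {}

    static String combine(String header, List<String> rows) {
        // The last non-blank header line must be the @data marker
        // (ARFF keywords are case-insensitive).
        String trimmed = header.trim();
        String lastLine = trimmed.substring(trimmed.lastIndexOf('\n') + 1).trim();
        if (!lastLine.equalsIgnoreCase("@data")) {
            throw new IllegalArgumentException("header must end with @data");
        }
        StringBuilder sb = new StringBuilder(header);
        if (!header.endsWith("\n")) {
            sb.append('\n');
        }
        // Rows are copied as-is; they must match the header's attribute order.
        for (String row : rows) {
            sb.append(row).append('\n');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String header = "@relation sample\n"
                + "@attribute answer {yes,no}\n"
                + "@attribute score numeric\n"
                + "@data\n";
        System.out.print(combine(header, Arrays.asList("yes,1.5", "no,2.0")));
    }
}
```

    Because the rows are copied verbatim, any nominal value that the fixed header does not declare would only surface as an error when the combined file is loaded.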
