weka | 易学教程

Why am I getting a 1.000 ROC area value even when I don't have 100% of accuracy

Question: I am using Weka as a classifier, and it has worked great for me so far. However, in my last test, I got a 1.000 ROC area value (which, if I remember correctly, represents a perfect classification) without having 100% accuracy, as can be seen in the confusion matrix in the figure. My question is: am I interpreting the results incorrectly, or am I getting wrong results (maybe the classifier I am using is badly programmed, although I don't think that's likely)? [Figure: classification output] Thank you!
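A ROC area of 1.000 and an accuracy below 100% are not contradictory. The ROC area only measures how well the classifier's scores rank positives above negatives, while accuracy depends on the single decision threshold actually used. The following is a minimal sketch in plain Java (no Weka calls); the scores, labels, and the 0.5 threshold are made-up illustration values, not the asker's data.

```java
// Sketch: a perfect ranking (AUC = 1.0) can still misclassify at a fixed threshold.
class AucVersusAccuracy {

    // AUC as the fraction of (positive, negative) pairs in which the positive
    // instance gets the higher score; ties count as half a win.
    static double auc(double[] scores, boolean[] isPositive) {
        double wins = 0.0;
        long pairs = 0;
        for (int i = 0; i < scores.length; i++) {
            if (!isPositive[i]) continue;
            for (int j = 0; j < scores.length; j++) {
                if (isPositive[j]) continue;
                pairs++;
                if (scores[i] > scores[j]) wins += 1.0;
                else if (scores[i] == scores[j]) wins += 0.5;
            }
        }
        return wins / pairs;
    }

    // Accuracy when every score >= threshold is predicted positive.
    static double accuracy(double[] scores, boolean[] isPositive, double threshold) {
        int correct = 0;
        for (int i = 0; i < scores.length; i++) {
            boolean predictedPositive = scores[i] >= threshold;
            if (predictedPositive == isPositive[i]) correct++;
        }
        return (double) correct / scores.length;
    }

    public static void main(String[] args) {
        // Hypothetical scores: every positive outranks every negative,
        // but one positive (0.4) falls below the 0.5 threshold.
        double[] scores = {0.9, 0.8, 0.4, 0.3, 0.2, 0.1};
        boolean[] labels = {true, true, true, false, false, false};
        System.out.println("AUC      = " + auc(scores, labels));
        System.out.println("Accuracy = " + accuracy(scores, labels, 0.5));
    }
}
```

In this toy data a threshold of 0.35 would classify everything correctly, which is what a ROC area of 1.0 says: some threshold separates the classes perfectly, though not necessarily the one the classifier uses.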

WEKA - filtering out classes in a MultiClassClassifier

Question: I have trained a MultiClassClassifier (tested, working) and saved it on my hard drive. Now I want to make predictions for a new sample. When my application loads, the classifier loads automatically with it. Outside the classification process, I have already narrowed the sample down to five possible classes. This means I know k classes that can safely be excluded from the classification. Is it possible to filter a MultiClassClassifier (filter out all unwanted classes) ...
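One possible approach, offered only as a sketch and not as a feature of MultiClassClassifier, is to leave the trained model untouched and filter its output instead: take the per-class probability array (in Weka, `distributionForInstance` returns such a `double[]`) and pick the most probable class among the allowed ones. The class name, method name, and probabilities below are hypothetical.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

// Sketch: restrict a prediction to a known subset of class indices.
class ClassFilter {

    // Returns the index of the most probable class among the allowed indices,
    // or -1 if no allowed index is present in the distribution.
    static int bestAllowedClass(double[] distribution, Set<Integer> allowed) {
        int best = -1;
        for (int i = 0; i < distribution.length; i++) {
            if (!allowed.contains(i)) continue;
            if (best < 0 || distribution[i] > distribution[best]) best = i;
        }
        return best;
    }

    public static void main(String[] args) {
        // Hypothetical class probabilities for one sample (4 classes).
        double[] distribution = {0.50, 0.30, 0.15, 0.05};
        // Classes known in advance to be possible for this sample.
        Set<Integer> allowed = new HashSet<>(Arrays.asList(1, 3));
        System.out.println("Predicted class index: "
                + bestAllowedClass(distribution, allowed));
    }
}
```

This keeps the saved model reusable for samples with a different set of candidate classes, at the cost of still computing probabilities for classes that will be ignored.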

Predicting the “no class” / unrecognised class in Weka Machine Learning

Question: I am using Weka 3.7 to classify text documents based on their content. I have a set of text files in folders, and each folder corresponds to a category. Category A: 100 txt files; Category B: 100 txt files; ...; Category X: 100 txt files. I want to predict whether a document falls into one of the categories A-X, or into the category UNRECOGNISED (for all other documents). I am getting the total set of Instances programmatically like this: private Instances getTotalSet(){ ArrayList<Attribute> ...

run same java class by different shell at the same time

Question: It may be a dumb question, but I just want to be sure. I want to run the same Java class (a Weka text classifier) from different shell scripts at the same time, each with a different data set. I am a little confused about this: will the class behave as if it were multi-threaded? If so, are Weka classifiers thread-safe? Answer 1: Running multiple instances of Weka classifiers from different shells runs them as different processes. This is safe, and their execution would not interfere with each ...

Weka class cannot be initialized: InvocationTargetException

Question: This is my first time using Weka, so I am sorry if my question seems naive, but I am really stuck on this problem. I am using Weka in my own Java project in Eclipse. I have successfully imported weka.jar with wekasource.jar attached. But when I run the program, every Weka class (Attribute, FastVector, etc.) fails to initialize. All the exceptions are the same: InvocationTargetException. I checked the error stack, which showed: java.lang.NoClassDefFoundError: weka/core/attribute ...

Why does k=1 in KNN give the best accuracy?

Question: I am using Weka IBk for text classification. Each document is basically a short sentence. The training dataset contains 15,000 documents. While testing, I can see that k=1 gives the best accuracy. How can this be explained? Answer 1: If you are querying your learner with the same dataset you trained on, then with k=1 the output values should be perfect, unless your data contains instances with identical parameters but different outcome values. Do some reading on overfitting as it applies to KNN learners. In ...
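The point in the answer can be checked with a tiny hand-written 1-nearest-neighbour classifier: when the query set is the training set, every instance is its own nearest neighbour at distance zero, so accuracy is perfect even when a label is noisy. This is a sketch in plain Java, not Weka's IBk; the one-dimensional data and labels are invented for illustration.

```java
// Sketch: 1-NN scores perfectly on its own training data, less so on held-out data.
class OneNearestNeighbour {

    // Predicts the label of the training point closest to the query
    // (squared Euclidean distance; the first point wins on ties).
    static int predict(double[][] trainX, int[] trainY, double[] query) {
        int best = 0;
        double bestDist = Double.POSITIVE_INFINITY;
        for (int i = 0; i < trainX.length; i++) {
            double dist = 0.0;
            for (int d = 0; d < query.length; d++) {
                double diff = trainX[i][d] - query[d];
                dist += diff * diff;
            }
            if (dist < bestDist) {
                bestDist = dist;
                best = i;
            }
        }
        return trainY[best];
    }

    static double accuracy(double[][] trainX, int[] trainY,
                           double[][] testX, int[] testY) {
        int correct = 0;
        for (int i = 0; i < testX.length; i++) {
            if (predict(trainX, trainY, testX[i]) == testY[i]) correct++;
        }
        return (double) correct / testX.length;
    }

    public static void main(String[] args) {
        // Hypothetical training data; the point at 2.0 carries a noisy label.
        double[][] trainX = {{0}, {1}, {2}, {10}, {11}, {12}};
        int[] trainY = {0, 0, 1, 1, 1, 1};
        // Hypothetical held-out data drawn from the "true" pattern.
        double[][] testX = {{0.4}, {2.2}, {10.5}};
        int[] testY = {0, 0, 1};

        System.out.println("Accuracy on training data: "
                + accuracy(trainX, trainY, trainX, trainY));
        System.out.println("Accuracy on held-out data: "
                + accuracy(trainX, trainY, testX, testY));
    }
}
```

If k=1 still wins on a genuinely separate test set, memorisation of the training data is not the explanation and the cause has to be looked for elsewhere.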

Cross Validation - Weka api

Question: How can I build a classification model with 10-fold cross-validation using the Weka API? I ask because a new classification model is created in each cross-validation run. Which classification model should I use on my test data? Thank you! Answer 1: 10-fold cross-validation is used to get an estimate of a classifier's accuracy should that classifier be constructed from all of the training data. It is used when it is felt that there is not enough data for an independent test set. This means that you ...
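The workflow the answer describes can be sketched as two separate steps: the per-fold models exist only to produce the accuracy estimate and are discarded, and the model that is actually deployed is trained once on all of the training data. The code below is plain Java with a deliberately trivial placeholder learner (majority label), not the Weka API; in Weka the corresponding calls would be `Evaluation.crossValidateModel` for the estimate and `buildClassifier` on the full training set for the final model. The labels and the fold assignment by index are assumptions made for illustration.

```java
// Sketch: cross-validation estimates accuracy; the final model uses all data.
class CrossValidationSketch {

    // Placeholder "learner": returns the majority label (0 or 1)
    // among the instances marked as training data.
    static int trainMajorityModel(int[] labels, boolean[] inTraining) {
        int ones = 0;
        int total = 0;
        for (int i = 0; i < labels.length; i++) {
            if (inTraining[i]) {
                total++;
                ones += labels[i];
            }
        }
        return 2 * ones > total ? 1 : 0;
    }

    // k-fold estimate: instance i is tested in fold (i % folds).
    static double crossValidatedAccuracy(int[] labels, int folds) {
        int correct = 0;
        for (int f = 0; f < folds; f++) {
            boolean[] inTraining = new boolean[labels.length];
            for (int i = 0; i < labels.length; i++) {
                inTraining[i] = (i % folds) != f;
            }
            // This per-fold model is used for scoring only, then thrown away.
            int foldModel = trainMajorityModel(labels, inTraining);
            for (int i = 0; i < labels.length; i++) {
                if (!inTraining[i] && labels[i] == foldModel) correct++;
            }
        }
        return (double) correct / labels.length;
    }

    public static void main(String[] args) {
        int[] labels = {1, 1, 1, 0, 1, 1, 0, 1, 1, 1}; // hypothetical data

        // Step 1: estimate how well the learner is expected to perform.
        double estimate = crossValidatedAccuracy(labels, 5);

        // Step 2: train the model to deploy on ALL of the training data.
        boolean[] all = new boolean[labels.length];
        java.util.Arrays.fill(all, true);
        int finalModel = trainMajorityModel(labels, all);

        System.out.println("Estimated accuracy: " + estimate);
        System.out.println("Final model predicts label: " + finalModel);
    }
}
```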

Executing Weka Classification in C# in Parallel

Question: I have asked a few broad questions about using Weka with C# and WekaSharp, so I thought I would ask a more focused question to try to progress further on my own. Starting from the example on the Weka site for running Weka from C#, I would like to run part of the calculation using parallel operations, but I am not sure how to code it. Here is the raw code: using System; using System.Collections.Generic; using System.Linq; using System.Text; using weka.classifiers ...

java.io.EOFException when reading weka trained model file

Question: I'm trying to load my Weka trained model file to generate a prediction, but I get a java.io.EOFException when I do so. I'm sure this has to do with my model file not being correctly formed. But I used the Weka tool to create the model file and don't understand what's wrong. Code: public Classifier loadModel() throws Exception { this.readConfFile(); Classifier classifier; FileInputStream fis = new FileInputStream( prop.getProperty("Output_Model_Dir") + "/best3.model"); ...
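Weka model files are Java-serialized objects, so one cause worth ruling out (the source does not confirm it is the actual cause here) is a file that is empty or was cut short while being written: Java deserialization reports that as `java.io.EOFException`. The sketch below uses only the standard library and in-memory byte arrays in place of a real model file; the helper names are hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.util.Arrays;

// Sketch: how an empty or truncated serialized file surfaces as EOFException.
class ModelFileCheck {

    static byte[] serialize(Object value) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        // try-with-resources flushes and closes the stream before we read the bytes.
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(value);
        }
        return bytes.toByteArray();
    }

    static Object deserialize(byte[] data) throws IOException, ClassNotFoundException {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(data))) {
            return in.readObject();
        }
    }

    // True if reading the bytes fails specifically with EOFException.
    static boolean failsWithEof(byte[] data) {
        try {
            deserialize(data);
            return false;
        } catch (EOFException e) {
            return true;
        } catch (IOException | ClassNotFoundException e) {
            return false;
        }
    }

    public static void main(String[] args) throws Exception {
        byte[] complete = serialize(new int[1000]);   // stands in for a model file
        byte[] truncated = Arrays.copyOf(complete, complete.length / 2);

        System.out.println("Complete file fails with EOF:  " + failsWithEof(complete));
        System.out.println("Empty file fails with EOF:     " + failsWithEof(new byte[0]));
        System.out.println("Truncated file fails with EOF: " + failsWithEof(truncated));
    }
}
```

A quick practical check along the same lines is to compare the size of the model file on disk with the size Weka originally wrote, and to confirm that the stream writing it was closed before the file was copied or read.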

how to use weka in keyphrase extraction from text arguments

Question: I am working on a project, "key phrase extraction from text arguments". For this I first cleaned the input and then determined a list of candidate phrases (around 300 in total) using the Stanford Parser (POS tagging). Then I computed the feature value of every phrase. I followed these steps for every document in my dataset. How should I proceed from here, i.e., how do I use WEKA to find keyphrases? How should I store the phrases and feature values (TF×IDF) in Weka? How to find efficiency of the ...