weka | 易学教程

Weka Linear regression ClassNotFoundException

Question:

    String filePath = new File("").getAbsolutePath();
    DataSource source = new DataSource(filePath + "\\src\\data\\data.arff");
    Instances dataset = source.getDataSet();
    // set class
    dataset.setClassIndex(0);
    // build model
    LinearRegression lr = new LinearRegression();
    lr.buildClassifier(dataset);
    System.out.println(lr);

Right after the LinearRegression instantiation I get this error:

    Exception in thread "main" java.lang.NoClassDefFoundError: no/uib/cipr/matrix/Matrix at weka_prediction.Main
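A NoClassDefFoundError for no/uib/cipr/matrix/Matrix means the JVM could not load that class at run time, which usually points to a jar missing from the run-time classpath. The sketch below is one way to check this before instantiating the classifier. It uses only the Java standard library; the helper name isOnClasspath is hypothetical and is not part of Weka.

```java
public class ClasspathCheck {
    // Hypothetical helper (not a Weka API): true if the named class can be
    // located by this class's class loader, without initializing it.
    public static boolean isOnClasspath(String className) {
        try {
            Class.forName(className, false, ClasspathCheck.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // Class name taken from the error message, with '/' replaced by '.'.
        String needed = "no.uib.cipr.matrix.Matrix";
        System.out.println(needed + " loadable: " + isOnClasspath(needed));
        // Printing the effective classpath shows which jars the JVM actually sees.
        System.out.println("Classpath: " + System.getProperty("java.class.path"));
    }
}
```

If the check reports false, the jar that provides that package is probably not on the classpath used to launch the program, even if the code compiled in the IDE.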

Weka - differences between Explorer and Experimenter outcomes

Question: I am wondering why the % correctly classified differs between the Explorer and the Experimenter in Weka. I have checked that I am using 10-fold cross-validation and that all other parameters are the same. Does anyone have any ideas? Thanks.

Answer 1: I have the solution, provided by Mark Hall after I emailed the Weka mailing list. Here is the difference between the Explorer and the Experimenter: The Experimenter operates differently from the Explorer. The Explorer sums evaluation metrics over the folds

Remove Missing Values in Weka

Question: I'm using a dataset in Weka for classification that includes missing values. As far as I understand, Weka replaces them automatically with the mode or mean of the training data (using the filter unsupervised/attribute/ReplaceMissingValues) when using a classifier like NaiveBayes. I would like to try removing them instead, to see how this affects the quality of the classifier. Is there a filter to do that?

Answer 1: My approach is not the perfect one because if you have more than 5 or 6 attributes then it
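To make the idea of removing instances with missing values concrete, here is a minimal sketch in plain Java. It does not use Weka's filter API. It assumes the data is held as a double[][] and that a missing value is encoded as Double.NaN; both are assumptions of this example, and the method name dropRowsWithMissing is hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

public class DropMissingRows {
    // Returns only the rows that contain no missing value.
    // Assumption of this sketch: a missing value is encoded as Double.NaN.
    public static double[][] dropRowsWithMissing(double[][] rows) {
        List<double[]> kept = new ArrayList<>();
        for (double[] row : rows) {
            boolean complete = true;
            for (double v : row) {
                if (Double.isNaN(v)) {
                    complete = false;
                    break;
                }
            }
            if (complete) {
                kept.add(row);
            }
        }
        return kept.toArray(new double[0][]);
    }

    public static void main(String[] args) {
        double[][] data = {
            {1.0, 2.0},
            {Double.NaN, 3.0},
            {4.0, 5.0}
        };
        double[][] cleaned = dropRowsWithMissing(data);
        System.out.println("Rows before: " + data.length + ", after: " + cleaned.length);
    }
}
```

Dropping rows this way shrinks the training set, so comparing the classifier's quality against the replace-with-mean approach is only fair when both are evaluated on the same test instances.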

Regarding RandomTree in Weka

Question: I was playing around with Weka when I noticed a minNum field in the RandomTree configuration. Its description says "The minimum total weight of the instances in a leaf", but I couldn't really understand what that means. When I experimented with the value, I found that increasing it reduces the size of the generated tree, and I couldn't work out why. Any help or references will be appreciated.

Answer 1: This has to do with the minimum number of

Installing the Weka data mining workbench on Windows 7

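1. My environment

Operating system: 32-bit Windows 7 Ultimate, Service Pack 1

2. Downloading Weka

Weka page on OSC: http://www.oschina.net/p/weka
Project home page: http://www.cs.waikato.ac.nz/ml/weka/
The Java version required by each Weka release is listed on this page: http://www.cs.waikato.ac.nz/ml/weka/requirements.html

[Screenshot: the requirements table from that page]

Go to the download page: http://www.cs.waikato.ac.nz/ml/weka/downloading.html

My machine runs 32-bit Windows 7 and had no Java virtual machine installed, so I chose the package in the "Windows x86" category that includes Java VM 1.7. Clicking the bold "here" link leads to the corresponding SourceForge page for the download. The downloaded installer is named weka-3-6-11jre.exe.

3. Installing Weka

Double-click weka-3-6-11jre.exe to start the setup wizard, then click "Next >" to continue. The next screen is the license agreement, GNU GENERAL PUBLIC LICENSE Version 2. Information about this license is available on the following page: Version 2.0: http://www.gnu.org/licenses/gpl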

Weka normalizing columns

Question: I have an ARFF file containing 14 numerical columns. I want to normalize each column separately, that is, replace each value in a column with (actual_value - min(this_column)) / (max(this_column) - min(this_column)). All values in a column will then be in the range [0, 1]. The min and max values of one column might differ from those of another column. How can I do this with Weka filters? Thanks.

Answer 1: This can be done using weka.filters.unsupervised.attribute
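The formula in the question can be sketched in plain Java to show exactly what per-column min-max normalization computes. This is an illustration of the arithmetic only, not Weka's filter implementation. The class and method names are hypothetical, and mapping a constant column to 0 is an assumption made here to avoid dividing by zero.

```java
public class MinMaxNormalize {
    // Rescales each column independently to [0, 1] using
    // (value - min(column)) / (max(column) - min(column)).
    // Assumption of this sketch: a constant column (max == min) maps to 0.0.
    public static double[][] normalizeColumns(double[][] data) {
        int rows = data.length;
        int cols = rows == 0 ? 0 : data[0].length;
        double[][] out = new double[rows][cols];
        for (int c = 0; c < cols; c++) {
            double min = Double.POSITIVE_INFINITY;
            double max = Double.NEGATIVE_INFINITY;
            for (int r = 0; r < rows; r++) {
                min = Math.min(min, data[r][c]);
                max = Math.max(max, data[r][c]);
            }
            double range = max - min;
            for (int r = 0; r < rows; r++) {
                out[r][c] = range == 0.0 ? 0.0 : (data[r][c] - min) / range;
            }
        }
        return out;
    }

    public static void main(String[] args) {
        double[][] data = {
            {1.0, 10.0},
            {2.0, 30.0},
            {3.0, 20.0}
        };
        for (double[] row : normalizeColumns(data)) {
            System.out.println(java.util.Arrays.toString(row));
        }
    }
}
```

Because the minimum and maximum are computed per column, two columns with very different scales both end up spanning the full [0, 1] range.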

Finding a correlation between variable and class variable

Question: I have a dataset that contains 7 numerical attributes and one nominal attribute, which is the class variable. I was wondering how I can find the best attribute for predicting the class attribute. Would finding the attribute with the largest information gain be the solution?

Answer 1: The problem you are asking about falls under the domain of feature selection and, more broadly, feature engineering. There is a lot of literature online regarding this, and there are definitely a lot of blogs
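To show what "information gain of an attribute" means for a numerical attribute, here is a minimal plain-Java sketch. It measures the gain of a single binary split (value <= threshold versus value > threshold) against integer class labels. The class name, method names, and the choice of one fixed threshold are assumptions of this example; it is not how Weka evaluates attributes.

```java
public class InfoGain {
    // Shannon entropy (in bits) of a class-count histogram.
    static double entropy(int[] counts) {
        int total = 0;
        for (int c : counts) {
            total += c;
        }
        if (total == 0) {
            return 0.0;
        }
        double h = 0.0;
        for (int c : counts) {
            if (c > 0) {
                double p = (double) c / total;
                h -= p * Math.log(p) / Math.log(2.0);
            }
        }
        return h;
    }

    // Information gain of splitting on values[i] <= threshold.
    // labels[i] must be in [0, numClasses).
    public static double infoGain(double[] values, int[] labels,
                                  int numClasses, double threshold) {
        int n = values.length;
        if (n == 0) {
            return 0.0;
        }
        int[] all = new int[numClasses];
        int[] left = new int[numClasses];
        int[] right = new int[numClasses];
        int nLeft = 0;
        for (int i = 0; i < n; i++) {
            all[labels[i]]++;
            if (values[i] <= threshold) {
                left[labels[i]]++;
                nLeft++;
            } else {
                right[labels[i]]++;
            }
        }
        int nRight = n - nLeft;
        // Gain = entropy before the split minus the weighted entropy after it.
        return entropy(all)
                - ((double) nLeft / n) * entropy(left)
                - ((double) nRight / n) * entropy(right);
    }

    public static void main(String[] args) {
        double[] values = {1.0, 2.0, 3.0, 4.0};
        int[] labels = {0, 0, 1, 1};
        System.out.println("Gain at 2.5: " + infoGain(values, labels, 2, 2.5));
        System.out.println("Gain at 1.5: " + infoGain(values, labels, 2, 1.5));
    }
}
```

Ranking the 7 attributes by their best achievable gain gives one possible ordering, though an attribute that looks weak on its own can still be useful in combination with others.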

Which Weka and LibSVM .jar files to use in Java code for SVM classification

Question: If I use the Weka Explorer to run some training data against testing data using SVM with a linear kernel, everything is fine. But I need to do this programmatically in my own Java code, and my current code looks like this:

    Instances train = new Instances (...);
    train.setClassIndex(train.numAttributes() - 1);
    Instances test = new Instances (...);
    ClassificationType classificationType = ClassificationTypeDAO.get(6);
    LibSVM libsvm = new LibSVM();
    String options = (classificationType.getParameters());

Unable to execute jar file despite having PATH and CLASSPATH set

Question: My question is about including jar files in the path. It has two parts. 1) I am trying to execute the jar file located at /home/andy/software/weka/weka.jar. The PATH variable points to this jar file (i.e. to /home/andy/software/weka/weka.jar), and so does CLASSPATH. However, when I try to run the jar using java -jar weka.jar, I get the error "Unable to access jarfile weka.jar". Any ideas what is going on? I am on Ubuntu Linux. I looked around on SO and it seems like I am not doing anything that is

Simple text classification using naive bayes (weka) in java

Question: I am trying to do text classification with naive Bayes using the Weka library in my Java code, but I think the classification result is not correct, and I don't know what the problem is. I use ARFF files for the input. This is my training data:

    @relation hamspam
    @attribute text string
    @attribute class {spam,ham}
    @data
    'good',ham
    'good',ham
    'very good',ham
    'bad',spam
    'very bad',spam
    'very bad, very bad',spam
    'good good bad',ham

This is my testing data:

    @relation test
    @attribute text string
    @attribute class {spam