weka

Classifying unlabelled data in Weka

Submitted by 邮差的信 on 2019-12-23 20:47:58
Question: I'm currently using various classifiers in Weka. My testing data is labelled, e.g.:

    @relation bmwreponses
    @attribute IncomeBracket {0,1,2,3,4,5,6,7}
    @attribute FirstPurchase numeric
    @attribute LastPurchase numeric
    @attribute responded {1,0}
    @data
    4,200210,200601,0
    5,200301,200601,1
    6,200411,200601,0
    5,199609,200603,0
    6,200310,200512,1
    ...

The last value per row is the class attribute, i.e. responded. But if I try unlabelled test data, e.g.:

    @relation bmwreponses
    @attribute IncomeBracket {0,1,2 …
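A minimal sketch of how unlabelled data is usually scored through the Weka Java API, assuming the unlabelled ARFF still declares the responded attribute but fills it with '?' in every @data row (the file names and the choice of J48 are illustrative, not taken from the question):

    import weka.classifiers.Classifier;
    import weka.classifiers.trees.J48;
    import weka.core.Instances;
    import weka.core.converters.ConverterUtils.DataSource;

    public class ScoreUnlabelled {
        public static void main(String[] args) throws Exception {
            // Train on the labelled file.
            Instances train = DataSource.read("bmw-train.arff");
            train.setClassIndex(train.numAttributes() - 1);
            Classifier cls = new J48();
            cls.buildClassifier(train);

            // The unlabelled file keeps the 'responded' declaration, but each
            // row has '?' in that column, so Weka treats the class as missing.
            Instances unlabelled = DataSource.read("bmw-unlabelled.arff");
            unlabelled.setClassIndex(unlabelled.numAttributes() - 1);

            for (int i = 0; i < unlabelled.numInstances(); i++) {
                double pred = cls.classifyInstance(unlabelled.instance(i));
                System.out.println(unlabelled.classAttribute().value((int) pred));
            }
        }
    }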

Java, Weka: NaiveBayesUpdateable: Cannot handle numeric class

Submitted by 左心房为你撑大大i on 2019-12-23 20:14:03
Question: I am trying to use the NaiveBayesUpdateable classifier from Weka. My data contains both nominal and numeric attributes:

    @relation cars
    @attribute country {FR, UK, ...}
    @attribute city {London, Paris, ...}
    @attribute car_make {Toyota, BMW, ...}
    @attribute price numeric    %% car price
    @attribute sales numeric    %% number of cars sold

I need to predict the number of sales (numeric!) based on the other attributes. When I run:

    // Train classifier
    ArffLoader loader = new ArffLoader();
    loader.setFile(new File …
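NaiveBayesUpdateable only accepts a nominal class, so a numeric sales attribute cannot be its prediction target directly. One workaround (an assumption on my part, not taken from this thread) is to keep the same incremental-loading loop but switch to an updateable learner that does support numeric classes, such as IBk; alternatively, sales could be discretized into bins so NaiveBayesUpdateable itself can be kept. A sketch of the first option, with an illustrative file name:

    import java.io.File;
    import weka.classifiers.lazy.IBk;
    import weka.core.Instance;
    import weka.core.Instances;
    import weka.core.converters.ArffLoader;

    public class IncrementalCars {
        public static void main(String[] args) throws Exception {
            ArffLoader loader = new ArffLoader();
            loader.setFile(new File("cars.arff"));
            Instances structure = loader.getStructure();
            structure.setClassIndex(structure.numAttributes() - 1); // 'sales' (numeric)

            IBk knn = new IBk(3);            // updateable and handles numeric classes
            knn.buildClassifier(structure);  // initialise from the header only
            Instance inst;
            while ((inst = loader.getNextInstance(structure)) != null) {
                knn.updateClassifier(inst);  // feed instances one at a time
            }
        }
    }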

How to use weights in Weka

Submitted by 随声附和 on 2019-12-23 10:03:24
Question: I need your help regarding weights in Weka. I am running experiments on a large amount of data: I translate the data into instances and use different classifiers to study it. Now I want to examine how assigning weights to instances affects the learning; sometimes I want to give an instance a weight and sometimes not. My questions are: What is the range of possible weights? Does the effect of a weight differ from classifier to classifier? Is there a default weight (I …
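For reference, every Weka instance carries a default weight of 1.0, weights are ordinary non-negative doubles, and they are honoured only by classifiers that implement WeightedInstancesHandler (other classifiers simply ignore them). A minimal sketch of setting weights through the API, assuming an illustrative file and the rule "double the weight of instances with class index 1":

    import weka.core.Instances;
    import weka.core.converters.ConverterUtils.DataSource;

    public class WeightInstances {
        public static void main(String[] args) throws Exception {
            Instances data = DataSource.read("train.arff");
            data.setClassIndex(data.numAttributes() - 1);

            // Every instance starts with the default weight of 1.0.
            // Here instances of class index 1 are given twice the weight.
            for (int i = 0; i < data.numInstances(); i++) {
                if ((int) data.instance(i).classValue() == 1) {
                    data.instance(i).setWeight(2.0);
                }
            }
        }
    }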

Support Vector Machine on R and WEKA

Submitted by 可紊 on 2019-12-23 03:24:24
Question: My data produced strange results with svm in R from the e1071 package, so I tried to check whether the R svm can produce the same result as WEKA (or Python), since I have used WEKA in the past. I googled the question and found one with exactly the same confusion as mine, but without an answer. This is the question. So I hope I can get an answer here. To make things easier, I'm also using the iris data set, and I train a model (SMO in WEKA, and svm from the R package e1071) using the whole iris …
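A common source of such discrepancies is that the two tools do not share defaults: Weka's SMO defaults to a linear polynomial kernel and normalises the training data, while e1071's svm defaults to an RBF kernel with scaling and cost = 1. A sketch of the Weka side configured to sit closer to the e1071 defaults; the gamma value 0.25 is an assumption based on e1071's default of 1/#features for the four iris attributes:

    import weka.classifiers.Evaluation;
    import weka.classifiers.functions.SMO;
    import weka.classifiers.functions.supportVector.RBFKernel;
    import weka.core.Instances;
    import weka.core.converters.ConverterUtils.DataSource;

    public class SmoIris {
        public static void main(String[] args) throws Exception {
            Instances iris = DataSource.read("iris.arff");
            iris.setClassIndex(iris.numAttributes() - 1);

            SMO smo = new SMO();
            RBFKernel rbf = new RBFKernel();
            rbf.setGamma(0.25);   // mimic e1071's default gamma = 1/#features
            smo.setKernel(rbf);
            smo.setC(1.0);        // e1071's default cost

            smo.buildClassifier(iris);
            Evaluation eval = new Evaluation(iris);
            eval.evaluateModel(smo, iris);   // evaluate on the training set, as in the question
            System.out.println(eval.toSummaryString());
        }
    }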

Design pattern to convert tree-rules from Weka into SQL query

Submitted by 别等时光非礼了梦想. on 2019-12-23 03:17:33
Question: I have some output from Weka that looks like this:

    fac_a < 64
    |   fac_d < 71.5
    |   |   fac_a < 49.5
    |   |   |   fac_d < 23.5 : 19.44 (13/43.71) [13/77.47]
    |   |   |   fac_d >= 23.5 : 24.25 (32/23.65) [16/49.15]
    |   |   fac_a >= 49.5 : 30.8 (10/17.68) [5/22.44]
    |   fac_d >= 71.5 : 33.6 (25/53.05) [15/47.35]
    fac_a >= 64
    |   fac_d < 83.5
    |   |   fac_a < 91
    |   |   |   fac_e < 93.5
    |   |   |   |   fac_d < 45 : 31.9 (16/23.25) [3/64.14]
    |   |   |   |   fac_d >= 45
    |   |   |   |   |   fac_e < 21.5 : 44.1 (5/16.58) [2/21.39]
    |   |   |   |   |   fac_e >= 21.5
    |   |   |   |   |   |   …
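Each leaf in such a tree corresponds to the conjunction of the conditions along the path from the root, which maps naturally onto a SQL CASE expression: one WHEN clause per leaf, ANDing the path conditions, with the leaf's prediction as the result. A sketch covering only the first branch above, with an assumed table name:

    public class TreeToSql {
        public static void main(String[] args) {
            // Each WHEN clause ANDs the conditions on the path to one leaf.
            String sql =
                "SELECT *,\n" +
                "  CASE\n" +
                "    WHEN fac_a < 64 AND fac_d < 71.5 AND fac_a < 49.5 AND fac_d < 23.5 THEN 19.44\n" +
                "    WHEN fac_a < 64 AND fac_d < 71.5 AND fac_a < 49.5 AND fac_d >= 23.5 THEN 24.25\n" +
                "    WHEN fac_a < 64 AND fac_d < 71.5 AND fac_a >= 49.5 THEN 30.8\n" +
                "    WHEN fac_a < 64 AND fac_d >= 71.5 THEN 33.6\n" +
                "    ELSE NULL\n" +
                "  END AS predicted_value\n" +
                "FROM my_table;";
            System.out.println(sql);
        }
    }

The remaining branches follow the same pattern, so the tree text could also be parsed line by line and the WHEN clauses generated in a loop.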

How to change attribute type to String (WEKA - CSV to ARFF)

Submitted by 可紊 on 2019-12-23 02:57:19
Question: I'm trying to build an SMS spam classifier using the WEKA library. I have a CSV file with "label" and "text" headings. When I use the code below, it creates an ARFF file with two attributes:

    @attribute label {ham,spam}
    @attribute text {'Go until jurong point','Ok lar...', etc.}

Currently the text attribute is created as a nominal attribute with each message's text as a value, but I need it to be a String attribute, not a list of all of the text from all …
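A sketch of one way to get a string attribute out of the conversion, assuming CSVLoader's string-attributes option (the API counterpart of the -S command-line switch) and illustrative file names:

    import java.io.File;
    import weka.core.Instances;
    import weka.core.converters.ArffSaver;
    import weka.core.converters.CSVLoader;

    public class CsvToArff {
        public static void main(String[] args) throws Exception {
            CSVLoader loader = new CSVLoader();
            loader.setSource(new File("sms.csv"));
            // Force the second column ("text") to be loaded as a STRING
            // attribute instead of a nominal one.
            loader.setStringAttributes("2");
            Instances data = loader.getDataSet();

            ArffSaver saver = new ArffSaver();
            saver.setInstances(data);
            saver.setFile(new File("sms.arff"));
            saver.writeBatch();
        }
    }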

Export a SQL database into a CSV file and use it with WEKA

Submitted by 心已入冬 on 2019-12-23 02:48:19
Question: How can I export a query result from a .sql database into a .csv file? I tried

    SELECT * FROM players
    INTO OUTFILE 'players.csv'
    FIELDS TERMINATED BY ','
    LINES TERMINATED BY ';';

and my .csv file is something like:

    p1,1,2,3
    p2,1,4,5

But they are not in separate columns; everything ends up in one column. I tried to create a .csv file by myself just to try WEKA, something like:

    p1 1 2 3
    p2 1 4 5

But WEKA recognizes "p1 1 2 3" as a single attribute. So: how can I correctly export a table from a SQL db to …
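Two things usually cause this: rows should be terminated by '\n' rather than ';' in the INTO OUTFILE clause, and WEKA's CSV loader expects commas (not spaces) between fields, which is why the hand-made space-separated file collapses into a single attribute. A sketch of loading the comma-separated export, assuming a recent Weka release where CSVLoader can handle a file without a header row (file name illustrative):

    import java.io.File;
    import weka.core.Instances;
    import weka.core.converters.CSVLoader;

    public class LoadPlayersCsv {
        public static void main(String[] args) throws Exception {
            CSVLoader loader = new CSVLoader();
            loader.setSource(new File("players.csv"));
            loader.setFieldSeparator(",");       // comma-separated fields
            loader.setNoHeaderRowPresent(true);  // the export has no header row
            Instances data = loader.getDataSet();
            System.out.println(data.numAttributes() + " attributes, "
                    + data.numInstances() + " instances");
        }
    }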

Weka predictive value of attributes

Submitted by 走远了吗. on 2019-12-23 02:30:54
Question: I've got a question about finding the predictive value of certain attributes. In my question it is suggested that I transform my attributes into binary classes and then apply "decision stump" to find out the predictive value of each attribute. How do I do this? I checked out this question, but that's not really what I mean. Thanks in advance, Rope.

Answer 1: You can find the most predictive attributes using the methods found under the Select Attributes tab in Weka's Explorer.

Answer 2: Yeah, the Select …
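The Select Attributes functionality can also be driven from the Java API. A sketch that ranks every attribute by information gain with respect to the class (the choice of InfoGainAttributeEval is an assumption; evaluating a DecisionStump per attribute would be another route):

    import weka.attributeSelection.AttributeSelection;
    import weka.attributeSelection.InfoGainAttributeEval;
    import weka.attributeSelection.Ranker;
    import weka.core.Instances;
    import weka.core.converters.ConverterUtils.DataSource;

    public class RankAttributes {
        public static void main(String[] args) throws Exception {
            Instances data = DataSource.read("data.arff");
            data.setClassIndex(data.numAttributes() - 1);

            AttributeSelection sel = new AttributeSelection();
            sel.setEvaluator(new InfoGainAttributeEval()); // score each attribute
            sel.setSearch(new Ranker());                   // rank them all
            sel.SelectAttributes(data);
            System.out.println(sel.toResultsString());
        }
    }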

What is the Stacking Algorithm in Weka? How does it actually work?

Submitted by 风格不统一 on 2019-12-22 11:59:07
Question: Are the results of the base classifiers selected by a voting system? And what does the meta classifier actually get as its input: the whole set of base-classifier outputs, or just the misclassified ones? It would be helpful if the whole mechanism could be explained with a simple example, like this one: Majority vote algorithm in Weka.classifiers.meta.vote. Thanks in advance.

Answer 1: Consider an ensemble of n members. Each of these members is trained on a given set of training data. The ensemble members may share …
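In short, stacking does not vote: each base (level-0) classifier is evaluated by internal cross-validation, its predictions on the held-out folds are collected into a new dataset, and the meta (level-1) classifier is trained on all of those predictions, not only on the misclassified instances. A sketch with Weka's Stacking meta classifier; the particular base and meta learners are illustrative choices:

    import weka.classifiers.Classifier;
    import weka.classifiers.functions.Logistic;
    import weka.classifiers.lazy.IBk;
    import weka.classifiers.meta.Stacking;
    import weka.classifiers.trees.J48;
    import weka.core.Instances;
    import weka.core.converters.ConverterUtils.DataSource;

    public class StackingDemo {
        public static void main(String[] args) throws Exception {
            Instances data = DataSource.read("data.arff");
            data.setClassIndex(data.numAttributes() - 1);

            Stacking stack = new Stacking();
            // Level-0 (base) models: their cross-validated predictions become
            // the attributes of the meta-level training set.
            stack.setClassifiers(new Classifier[] { new J48(), new IBk(5) });
            // Level-1 (meta) model: learns how to combine the base predictions.
            stack.setMetaClassifier(new Logistic());
            stack.setNumFolds(10);

            stack.buildClassifier(data);
            System.out.println(stack);
        }
    }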

SMOTE oversampling and cross-validation

Submitted by 泪湿孤枕 on 2019-12-22 06:28:47
Question: I am working on a binary classification problem in Weka with a highly imbalanced data set (90% in one category and 10% in the other). I first applied SMOTE (http://www.cs.cmu.edu/afs/cs/project/jair/pub/volume16/chawla02a-html/node6.html) to the entire data set to even out the categories and then performed 10-fold cross-validation on the newly obtained data. I found (overly?) optimistic results with an F1 around 90%. Is this due to oversampling? Is it bad practice to perform cross-validation …
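Applying SMOTE before splitting does tend to inflate the scores: synthetic minority examples are interpolated from originals that later land in the test fold, so near-duplicates of test instances leak into training. The usual remedy is to apply SMOTE inside each training fold only, for example by wrapping it in a FilteredClassifier so that cross-validation re-runs the filter per fold. A sketch, assuming the SMOTE filter installed from Weka's package manager and an illustrative base classifier:

    import java.util.Random;
    import weka.classifiers.Evaluation;
    import weka.classifiers.meta.FilteredClassifier;
    import weka.classifiers.trees.J48;
    import weka.core.Instances;
    import weka.core.converters.ConverterUtils.DataSource;
    import weka.filters.supervised.instance.SMOTE;

    public class SmoteInsideCv {
        public static void main(String[] args) throws Exception {
            Instances data = DataSource.read("imbalanced.arff");
            data.setClassIndex(data.numAttributes() - 1);

            // SMOTE is applied only to the training data of each fold; the
            // held-out test fold is never oversampled.
            FilteredClassifier fc = new FilteredClassifier();
            fc.setFilter(new SMOTE());
            fc.setClassifier(new J48());

            Evaluation eval = new Evaluation(data);
            eval.crossValidateModel(fc, data, 10, new Random(1));
            System.out.println(eval.toClassDetailsString());
        }
    }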