weka

Classifying unlabelled data in Weka

Submitted by 邮差的信 on 2019-12-23 20:47:58
Question: I'm currently using various classifiers in Weka. My testing data is labelled, e.g.:

    @relation bmwreponses
    @attribute IncomeBracket {0,1,2,3,4,5,6,7}
    @attribute FirstPurchase numeric
    @attribute LastPurchase numeric
    @attribute responded {1,0}
    @data
    4,200210,200601,0
    5,200301,200601,1
    6,200411,200601,0
    5,199609,200603,0
    6,200310,200512,1
    ...

The last value per row is the class attribute, i.e. responded. But if I try unlabelled test data, e.g.:

    @relation bmwreponses
    @attribute IncomeBracket {0,1,2 …
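A minimal sketch of how unlabelled data is usually scored through the Weka Java API, assuming the unlabelled ARFF still declares the responded attribute but fills it with '?' in every @data row (the file names and the choice of J48 are illustrative, not taken from the question):

    import weka.classifiers.Classifier;
    import weka.classifiers.trees.J48;
    import weka.core.Instances;
    import weka.core.converters.ConverterUtils.DataSource;

    public class ScoreUnlabelled {
        public static void main(String[] args) throws Exception {
            // Train on the labelled file.
            Instances train = DataSource.read("bmw-train.arff");
            train.setClassIndex(train.numAttributes() - 1);
            Classifier cls = new J48();
            cls.buildClassifier(train);

            // The unlabelled file keeps the 'responded' declaration, but each
            // row has '?' in that column, so Weka treats the class as missing.
            Instances unlabelled = DataSource.read("bmw-unlabelled.arff");
            unlabelled.setClassIndex(unlabelled.numAttributes() - 1);

            for (int i = 0; i < unlabelled.numInstances(); i++) {
                double pred = cls.classifyInstance(unlabelled.instance(i));
                System.out.println(unlabelled.classAttribute().value((int) pred));
            }
        }
    }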

Java, Weka: NaiveBayesUpdateable: Cannot handle numeric class

Submitted by 左心房为你撑大大i on 2019-12-23 20:14:03
Question: I am trying to use the NaiveBayesUpdateable classifier from Weka. My data contains both nominal and numeric attributes:

    @relation cars
    @attribute country {FR, UK, ...}
    @attribute city {London, Paris, ...}
    @attribute car_make {Toyota, BMW, ...}
    @attribute price numeric    %% car price
    @attribute sales numeric    %% number of cars sold

I need to predict the number of sales (numeric!) based on the other attributes. When I run:

    // Train classifier
    ArffLoader loader = new ArffLoader();
    loader.setFile(new File …
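NaiveBayesUpdateable only accepts a nominal class, so a numeric sales attribute cannot be its prediction target directly. One workaround (an assumption on my part, not taken from this thread) is to keep the same incremental-loading loop but switch to an updateable learner that does support numeric classes, such as IBk; alternatively, sales could be discretized into bins so NaiveBayesUpdateable itself can be kept. A sketch of the first option, with an illustrative file name:

    import java.io.File;
    import weka.classifiers.lazy.IBk;
    import weka.core.Instance;
    import weka.core.Instances;
    import weka.core.converters.ArffLoader;

    public class IncrementalCars {
        public static void main(String[] args) throws Exception {
            ArffLoader loader = new ArffLoader();
            loader.setFile(new File("cars.arff"));
            Instances structure = loader.getStructure();
            structure.setClassIndex(structure.numAttributes() - 1); // 'sales' (numeric)

            IBk knn = new IBk(3);            // updateable and handles numeric classes
            knn.buildClassifier(structure);  // initialise from the header only
            Instance inst;
            while ((inst = loader.getNextInstance(structure)) != null) {
                knn.updateClassifier(inst);  // feed instances one at a time
            }
        }
    }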

How to use weights in Weka

Submitted by 随声附和 on 2019-12-23 10:03:24
Question: I need your help regarding weights in Weka. I am running experiments on a large amount of data: I translate the data into instances and use different classifiers to study it. Now I want to examine how assigning weights to instances affects the learning; sometimes I want to give an instance a weight and sometimes not. My questions are: What is the range of possible weights? Does the effect of a weight differ from classifier to classifier? Is there a default weight (I …
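For reference, every Weka instance carries a default weight of 1.0, weights are ordinary non-negative doubles, and they are honoured only by classifiers that implement WeightedInstancesHandler (other classifiers simply ignore them). A minimal sketch of setting weights through the API, assuming an illustrative file and the rule "double the weight of instances with class index 1":

    import weka.core.Instances;
    import weka.core.converters.ConverterUtils.DataSource;

    public class WeightInstances {
        public static void main(String[] args) throws Exception {
            Instances data = DataSource.read("train.arff");
            data.setClassIndex(data.numAttributes() - 1);

            // Every instance starts with the default weight of 1.0.
            // Here instances of class index 1 are given twice the weight.
            for (int i = 0; i < data.numInstances(); i++) {
                if ((int) data.instance(i).classValue() == 1) {
                    data.instance(i).setWeight(2.0);
                }
            }
        }
    }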

Support Vector Machine on R and WEKA

Submitted by 可紊 on 2019-12-23 03:24:24
Question: My data produced strange results with svm in R from the e1071 package, so I tried to check whether the R svm can produce the same result as WEKA (or Python), since I have used WEKA in the past. I googled the question and found one with exactly the same confusion as mine, but without an answer. This is the question. So I hope I can get an answer here. To make things easier, I'm also using the iris data set, and I train a model (SMO in WEKA, and svm from the R package e1071) using the whole iris …
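A common source of such discrepancies is that the two tools do not share defaults: Weka's SMO defaults to a linear polynomial kernel and normalises the training data, while e1071's svm defaults to an RBF kernel with scaling and cost = 1. A sketch of the Weka side configured to sit closer to the e1071 defaults; the gamma value 0.25 is an assumption based on e1071's default of 1/#features for the four iris attributes:

    import weka.classifiers.Evaluation;
    import weka.classifiers.functions.SMO;
    import weka.classifiers.functions.supportVector.RBFKernel;
    import weka.core.Instances;
    import weka.core.converters.ConverterUtils.DataSource;

    public class SmoIris {
        public static void main(String[] args) throws Exception {
            Instances iris = DataSource.read("iris.arff");
            iris.setClassIndex(iris.numAttributes() - 1);

            SMO smo = new SMO();
            RBFKernel rbf = new RBFKernel();
            rbf.setGamma(0.25);   // mimic e1071's default gamma = 1/#features
            smo.setKernel(rbf);
            smo.setC(1.0);        // e1071's default cost

            smo.buildClassifier(iris);
            Evaluation eval = new Evaluation(iris);
            eval.evaluateModel(smo, iris);   // evaluate on the training set, as in the question
            System.out.println(eval.toSummaryString());
        }
    }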

Design pattern to convert tree-rules from Weka into SQL query

Submitted by 别等时光非礼了梦想. on 2019-12-23 03:17:33
Question: I have some output from Weka that looks like this:

    fac_a < 64
    |   fac_d < 71.5
    |   |   fac_a < 49.5
    |   |   |   fac_d < 23.5 : 19.44 (13/43.71) [13/77.47]
    |   |   |   fac_d >= 23.5 : 24.25 (32/23.65) [16/49.15]
    |   |   fac_a >= 49.5 : 30.8 (10/17.68) [5/22.44]
    |   fac_d >= 71.5 : 33.6 (25/53.05) [15/47.35]
    fac_a >= 64
    |   fac_d < 83.5
    |   |   fac_a < 91
    |   |   |   fac_e < 93.5
    |   |   |   |   fac_d < 45 : 31.9 (16/23.25) [3/64.14]
    |   |   |   |   fac_d >= 45
    |   |   |   |   |   fac_e < 21.5 : 44.1 (5/16.58) [2/21.39]
    |   |   |   |   |   fac_e >= 21.5
    |   |   |   |   |   |   …
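Each leaf in such a tree corresponds to the conjunction of the conditions along the path from the root, which maps naturally onto a SQL CASE expression: one WHEN clause per leaf, ANDing the path conditions, with the leaf's prediction as the result. A sketch covering only the first branch above, with an assumed table name:

    public class TreeToSql {
        public static void main(String[] args) {
            // Each WHEN clause ANDs the conditions on the path to one leaf.
            String sql =
                "SELECT *,\n" +
                "  CASE\n" +
                "    WHEN fac_a < 64 AND fac_d < 71.5 AND fac_a < 49.5 AND fac_d < 23.5 THEN 19.44\n" +
                "    WHEN fac_a < 64 AND fac_d < 71.5 AND fac_a < 49.5 AND fac_d >= 23.5 THEN 24.25\n" +
                "    WHEN fac_a < 64 AND fac_d < 71.5 AND fac_a >= 49.5 THEN 30.8\n" +
                "    WHEN fac_a < 64 AND fac_d >= 71.5 THEN 33.6\n" +
                "    ELSE NULL\n" +
                "  END AS predicted_value\n" +
                "FROM my_table;";
            System.out.println(sql);
        }
    }

The remaining branches follow the same pattern, so the tree text could also be parsed line by line and the WHEN clauses generated in a loop.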

How to change attribute type to String (WEKA - CSV to ARFF)

Submitted by 可紊 on 2019-12-23 02:57:19
Question: I'm trying to build an SMS spam classifier using the WEKA library. I have a CSV file with "label" and "text" headings. When I use the code below, it creates an ARFF file with two attributes:

    @attribute label {ham,spam}
    @attribute text {'Go until jurong point','Ok lar...', etc.}

Currently the text attribute is created as a nominal attribute with each message's text as a value, but I need it to be a String attribute, not a list of all of the text from all …
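A sketch of one way to get a string attribute out of the conversion, assuming CSVLoader's string-attributes option (the API counterpart of the -S command-line switch) and illustrative file names:

    import java.io.File;
    import weka.core.Instances;
    import weka.core.converters.ArffSaver;
    import weka.core.converters.CSVLoader;

    public class CsvToArff {
        public static void main(String[] args) throws Exception {
            CSVLoader loader = new CSVLoader();
            loader.setSource(new File("sms.csv"));
            // Force the second column ("text") to be loaded as a STRING
            // attribute instead of a nominal one.
            loader.setStringAttributes("2");
            Instances data = loader.getDataSet();

            ArffSaver saver = new ArffSaver();
            saver.setInstances(data);
            saver.setFile(new File("sms.arff"));
            saver.writeBatch();
        }
    }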

Export a SQL database into a CSV file and use it with WEKA

Submitted by 心已入冬 on 2019-12-23 02:48:19
Question: How can I export a query result from a .sql database into a .csv file? I tried

    SELECT * FROM players
    INTO OUTFILE 'players.csv'
    FIELDS TERMINATED BY ','
    LINES TERMINATED BY ';';

and my .csv file is something like:

    p1,1,2,3
    p2,1,4,5

But they are not in separate columns; everything ends up in one column. I tried to create a .csv file by myself just to try WEKA, something like:

    p1 1 2 3
    p2 1 4 5

But WEKA recognizes "p1 1 2 3" as a single attribute. So: how can I correctly export a table from a SQL db to …
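Two things usually cause this: rows should be terminated by '\n' rather than ';' in the INTO OUTFILE clause, and WEKA's CSV loader expects commas (not spaces) between fields, which is why the hand-made space-separated file collapses into a single attribute. A sketch of loading the comma-separated export, assuming a recent Weka release where CSVLoader can handle a file without a header row (file name illustrative):

    import java.io.File;
    import weka.core.Instances;
    import weka.core.converters.CSVLoader;

    public class LoadPlayersCsv {
        public static void main(String[] args) throws Exception {
            CSVLoader loader = new CSVLoader();
            loader.setSource(new File("players.csv"));
            loader.setFieldSeparator(",");       // comma-separated fields
            loader.setNoHeaderRowPresent(true);  // the export has no header row
            Instances data = loader.getDataSet();
            System.out.println(data.numAttributes() + " attributes, "
                    + data.numInstances() + " instances");
        }
    }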

Weka predictive value of attributes

Submitted by 走远了吗. on 2019-12-23 02:30:54
Question: I've got a question about finding the predictive value of certain attributes. In my question it is suggested that I transform my attributes into binary classes and then apply "decision stump" to find out the predictive value of each attribute. How do I do this? I checked out this question, but that's not really what I mean. Thanks in advance, Rope.

Answer 1: You can find the most predictive attributes using the methods found under the Select Attributes tab in Weka's Explorer.

Answer 2: Yeah, the Select …
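The Select Attributes functionality can also be driven from the Java API. A sketch that ranks every attribute by information gain with respect to the class (the choice of InfoGainAttributeEval is an assumption; evaluating a DecisionStump per attribute would be another route):

    import weka.attributeSelection.AttributeSelection;
    import weka.attributeSelection.InfoGainAttributeEval;
    import weka.attributeSelection.Ranker;
    import weka.core.Instances;
    import weka.core.converters.ConverterUtils.DataSource;

    public class RankAttributes {
        public static void main(String[] args) throws Exception {
            Instances data = DataSource.read("data.arff");
            data.setClassIndex(data.numAttributes() - 1);

            AttributeSelection sel = new AttributeSelection();
            sel.setEvaluator(new InfoGainAttributeEval()); // score each attribute
            sel.setSearch(new Ranker());                   // rank them all
            sel.SelectAttributes(data);
            System.out.println(sel.toResultsString());
        }
    }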

What is the Stacking Algorithm in Weka? How does it actually work?

Submitted by 风格不统一 on 2019-12-22 11:59:07
Question: Are the results of the base classifiers selected by a voting system? And what does the meta classifier actually get as its input: the whole set of base-classifier outputs, or just the misclassified ones? It would be helpful if the whole mechanism could be explained with a simple example, like this one: Majority vote algorithm in Weka.classifiers.meta.vote. Thanks in advance.

Answer 1: Consider an ensemble of n members. Each of these members is trained on a given set of training data. The ensemble members may share …
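In short, stacking does not vote: each base (level-0) classifier is evaluated by internal cross-validation, its predictions on the held-out folds are collected into a new dataset, and the meta (level-1) classifier is trained on all of those predictions, not only on the misclassified instances. A sketch with Weka's Stacking meta classifier; the particular base and meta learners are illustrative choices:

    import weka.classifiers.Classifier;
    import weka.classifiers.functions.Logistic;
    import weka.classifiers.lazy.IBk;
    import weka.classifiers.meta.Stacking;
    import weka.classifiers.trees.J48;
    import weka.core.Instances;
    import weka.core.converters.ConverterUtils.DataSource;

    public class StackingDemo {
        public static void main(String[] args) throws Exception {
            Instances data = DataSource.read("data.arff");
            data.setClassIndex(data.numAttributes() - 1);

            Stacking stack = new Stacking();
            // Level-0 (base) models: their cross-validated predictions become
            // the attributes of the meta-level training set.
            stack.setClassifiers(new Classifier[] { new J48(), new IBk(5) });
            // Level-1 (meta) model: learns how to combine the base predictions.
            stack.setMetaClassifier(new Logistic());
            stack.setNumFolds(10);

            stack.buildClassifier(data);
            System.out.println(stack);
        }
    }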

SMOTE oversampling and cross-validation

Submitted by 泪湿孤枕 on 2019-12-22 06:28:47
Question: I am working on a binary classification problem in Weka with a highly imbalanced data set (90% in one category and 10% in the other). I first applied SMOTE (http://www.cs.cmu.edu/afs/cs/project/jair/pub/volume16/chawla02a-html/node6.html) to the entire data set to even out the categories and then performed 10-fold cross-validation on the newly obtained data. I found (overly?) optimistic results with an F1 around 90%. Is this due to oversampling? Is it bad practice to perform cross-validation …
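Applying SMOTE before splitting does tend to inflate the scores: synthetic minority examples are interpolated from originals that later land in the test fold, so near-duplicates of test instances leak into training. The usual remedy is to apply SMOTE inside each training fold only, for example by wrapping it in a FilteredClassifier so that cross-validation re-runs the filter per fold. A sketch, assuming the SMOTE filter installed from Weka's package manager and an illustrative base classifier:

    import java.util.Random;
    import weka.classifiers.Evaluation;
    import weka.classifiers.meta.FilteredClassifier;
    import weka.classifiers.trees.J48;
    import weka.core.Instances;
    import weka.core.converters.ConverterUtils.DataSource;
    import weka.filters.supervised.instance.SMOTE;

    public class SmoteInsideCv {
        public static void main(String[] args) throws Exception {
            Instances data = DataSource.read("imbalanced.arff");
            data.setClassIndex(data.numAttributes() - 1);

            // SMOTE is applied only to the training data of each fold; the
            // held-out test fold is never oversampled.
            FilteredClassifier fc = new FilteredClassifier();
            fc.setFilter(new SMOTE());
            fc.setClassifier(new J48());

            Evaluation eval = new Evaluation(data);
            eval.crossValidateModel(fc, data, 10, new Random(1));
            System.out.println(eval.toClassDetailsString());
        }
    }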