weka

Weka - binary classification giving polarized/biased results

女生的网名这么多〃 submitted on 2019-12-14 03:57:16
Question: Let me say up front that I'm a WEKA newbie. I'm using WEKA for a binary classification problem in which certain metrics are used to get a yes/no answer for each instance. To illustrate the issue, here is the confusion matrix I got with BayesNet for a set of 288 instances, 190 with the value 'yes' and 98 with the value 'no':
    a    b   <-- classified as
  190    0 | a = yes
   98    0 | b = no
This absolute separation also occurs with some other classifiers, but not with all of them. That said, even if classifiers
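As a quick check on the matrix quoted above, here is a minimal sketch in plain Python (not Weka's API; the function names are illustrative assumptions). It shows that when every instance is classified as 'yes', the accuracy equals the share of the majority class, 190/288, while the recall for 'no' is zero.

```python
# Confusion matrix from the question: outer keys are actual classes,
# inner keys are predicted classes.
confusion = {
    "yes": {"yes": 190, "no": 0},
    "no": {"yes": 98, "no": 0},
}


def accuracy(matrix):
    """Fraction of instances on the diagonal (correctly classified)."""
    correct = sum(matrix[label][label] for label in matrix)
    total = sum(sum(row.values()) for row in matrix.values())
    return correct / total


def recall(matrix, label):
    """Fraction of the instances of `label` that were predicted as `label`."""
    row_total = sum(matrix[label].values())
    return matrix[label][label] / row_total if row_total else 0.0


def majority_baseline(matrix):
    """Accuracy obtained by always predicting the most frequent actual class."""
    totals = [sum(row.values()) for row in matrix.values()]
    return max(totals) / sum(totals)


if __name__ == "__main__":
    print(f"accuracy          = {accuracy(confusion):.4f}")
    print(f"majority baseline = {majority_baseline(confusion):.4f}")
    print(f"recall for 'yes'  = {recall(confusion, 'yes'):.4f}")
    print(f"recall for 'no'   = {recall(confusion, 'no'):.4f}")
```

An accuracy equal to the majority baseline suggests the classifier has learned nothing beyond the class prior, which is worth ruling out before comparing classifiers on accuracy alone.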

Weka - Classifier returns the same distribution for any input

情到浓时终转凉″ submitted on 2019-12-14 03:30:04
Question: I'm trying to build a naive Bayes classifier that classifies text into one of two classes. Everything works well in the GUI Explorer, but when I try to recreate it in code, I get the same output no matter what input I classify. In code I get the same evaluation metrics as in the GUI (81% accuracy), but whenever I create a new instance and classify it, I get the same distribution over both classes regardless of the input. Below is my code - it's in Scala, but is

libsvm class not found in weka

帅比萌擦擦* submitted on 2019-12-13 19:42:44
Question: I installed LibSVM in Weka from the package manager, and the installation succeeded. But when I run the following command:
    java -cp ./weka.jar weka.classifiers.meta.FilteredClassifier -F weka.filters.unsupervised.attribute.RemoveType -W weka.classifiers.functions.LibSVM -t training.arff -no-cv -T testing.arff -v -o
it reports:
    Can't find class called: weka.classifiers.functions.LibSVM
Other classifiers, such as Naive Bayes, work. Why can't it find the class? I am using a Mac. Answer 1:

weka batch filtering StringToWordVector

大憨熊 submitted on 2019-12-13 17:30:15
Question: I'm trying to use Weka for text classification. I have two ARFF files. One is for the training set (example data row): "mouse",no,no,no,no,no,yes,no and the other is for the test set (example data row): "cat",?,?,?,?,?,?,? They have the same attribute declarations, but if I use batch filtering it tells me "Input file formats differ". Why? Here is the command that I use: C:\Programmi\Weka-3-6>java -cp C:\Programmi\Weka-3-6\weka.jar weka.filters.unsupervised.attribute.StringToWordVector -b

A discrepancy in computing nearest neighbours between R and Java + WEKA

此生再无相见时 submitted on 2019-12-13 16:09:18
Question: I am debugging a library and another implementation, both of which compute k-nearest neighbours. I am framing the question around an example that I have difficulty understanding. First I will demonstrate the problem with a toy example, then show the output that leads to the question. Task: The demo reads a CSV file containing 10 two-dimensional data points. The task is to find the distance of every data point from the first data point, and list all
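The task described above (distances of all points from the first one) can be sketched in plain Python. This is a minimal illustration, not the asker's R or WEKA code: Euclidean distance is an assumption, the function names are made up for the example, and the sample coordinates are hypothetical.

```python
import csv
import io
import math


def read_points(csv_text):
    """Parse CSV text with one data point per row into tuples of floats."""
    reader = csv.reader(io.StringIO(csv_text))
    return [tuple(float(value) for value in row) for row in reader if row]


def distances_from_first(points):
    """Return (index, distance) pairs for every point after the first, nearest first.

    Euclidean distance is assumed. Ties are broken by the original index so
    that the ordering is the same on every run.
    """
    origin = points[0]
    pairs = [(i, math.dist(origin, p)) for i, p in enumerate(points) if i > 0]
    return sorted(pairs, key=lambda pair: (pair[1], pair[0]))


if __name__ == "__main__":
    # Hypothetical sample data standing in for the CSV file from the question.
    sample = "0,0\n3,4\n1,0\n0,1\n"
    for index, distance in distances_from_first(read_points(sample)):
        print(index, distance)
```

Tie-breaking is made explicit here because two implementations can return equally distant neighbours in a different order, which is one detail worth pinning down when comparing their output.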

Does anyone know how to generate AUC/ROC Area based on the prediction?

有些话、适合烂在心里 submitted on 2019-12-13 12:17:44
Question: I know that the AUC/ROC area (http://weka.wikispaces.com/Area+under+the+curve) in Weka is based on the Mann-Whitney statistic (http://en.wikipedia.org/wiki/Mann-Whitney_U). My question is this: if I have 10 labeled instances (Y or N, a binary target attribute) and apply an algorithm (e.g. J48) to the dataset, I get 10 predicted labels for these 10 instances. What exactly should I then use to calculate AUC_Y, AUC_N, and AUC_Avg? Use the prediction's ranked label Y and N or the actual
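To make the Mann-Whitney connection concrete, here is a minimal sketch in plain Python (not Weka's implementation; the function name, scores, and labels are hypothetical). It ranks instances by the predicted probability of the positive class and compares that ranking against the actual labels: the AUC is the fraction of (positive, negative) pairs in which the positive instance receives the higher score, with ties counted as half.

```python
def auc_mann_whitney(scores, labels, positive):
    """AUC as the normalised Mann-Whitney U statistic.

    scores: predicted score for the `positive` class, one per instance
    labels: actual class labels, one per instance
    """
    pos = [s for s, y in zip(scores, labels) if y == positive]
    neg = [s for s, y in zip(scores, labels) if y != positive]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative instance")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # positive ranked above negative
            elif p == n:
                wins += 0.5   # a tie counts as half a win
    return wins / (len(pos) * len(neg))


if __name__ == "__main__":
    # Hypothetical predicted probabilities of class Y for 10 instances.
    prob_y = [0.9, 0.8, 0.7, 0.6, 0.55, 0.4, 0.3, 0.2, 0.1, 0.05]
    actual = ["Y", "Y", "N", "Y", "Y", "N", "N", "Y", "N", "N"]

    auc_y = auc_mann_whitney(prob_y, actual, positive="Y")
    # For class N, rank by the complementary probability 1 - P(Y).
    auc_n = auc_mann_whitney([1 - p for p in prob_y], actual, positive="N")
    print(auc_y, auc_n)
```

With only hard predicted labels instead of probabilities, every instance scores either 0 or 1, so most pairs are ties and the ranking is very coarse. In the binary case with complementary probabilities, the two per-class values coincide, because each (Y, N) pair is simply compared in the opposite direction.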

Invoke Weka tool inside PHP application

会有一股神秘感。 submitted on 2019-12-13 09:23:52
Question: I am developing a web application to analyse the effect of a person's sleep habits on their health and performance. Can anyone help me integrate the Weka tool into my application for the data analysis? Answer 1: You may want to take a look at shell_exec (or exec, system, passthru). For a summary of the differences between these functions, this answer may help: https://stackoverflow.com/a/20072886/3052648 Source: https://stackoverflow.com/questions/31717468/invoke-weka-tool-inside-php

RandomForest classification in Weka

ぃ、小莉子 submitted on 2019-12-13 04:35:12
Question: The attributes are saved in 11 columns in a CSV file. If the order of the columns changes, could RandomForest and RandomTree give a different accuracy each time? Answer 1: The ordering of the features does not affect any of the classifiers I know (except those specially designed to depend on it, such as specialized classifiers for time series and other temporal features), whether it is a neural network, SVM, RandomForest, RandomTree or NaiveBayes - it is just a numerical simplification, as arrays

Replace missing values with mean (Weka)

孤街醉人 submitted on 2019-12-13 02:44:31
Question: In Weka there is a filter called "ReplaceMissingValues" that replaces all missing values in a dataset with the mean of each attribute. I'd like to replace the missing values of a certain attribute with the mean of the values that belong to a certain class. For example, in a binary dataset I think it is more correct to replace a missing attribute value in a record that belongs to the positive class with the mean calculated from only the records that belong to the positive class. So
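The idea described above, imputing with the mean of the record's own class, can be sketched in plain Python. This is a minimal illustration and not a Weka filter: the function name, the dictionary-based record layout, and the use of None to mark a missing value are all assumptions made for the example.

```python
from collections import defaultdict


def impute_by_class_mean(rows, attr, class_attr):
    """Return copies of `rows` with missing `attr` values (None) replaced by
    the mean of `attr` over the rows that share the same class value."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    for row in rows:
        if row[attr] is not None:
            sums[row[class_attr]] += row[attr]
            counts[row[class_attr]] += 1
    class_means = {label: sums[label] / counts[label] for label in counts}

    imputed = []
    for row in rows:
        filled = dict(row)  # copy so the input rows are left untouched
        # A value stays missing if its class has no observed values at all.
        if filled[attr] is None and filled[class_attr] in class_means:
            filled[attr] = class_means[filled[class_attr]]
        imputed.append(filled)
    return imputed


if __name__ == "__main__":
    # Hypothetical records: one numeric attribute "x" and a binary class.
    data = [
        {"x": 1.0, "class": "pos"},
        {"x": 3.0, "class": "pos"},
        {"x": None, "class": "pos"},
        {"x": 10.0, "class": "neg"},
        {"x": None, "class": "neg"},
    ]
    for record in impute_by_class_mean(data, "x", "class"):
        print(record)
```

Because the replacement depends on the class label, this only works on instances whose class is known, so it cannot be applied as-is to unlabeled test instances.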

Weka machine learning: how to interpret Naive Bayes classifier?

*爱你&永不变心* submitted on 2019-12-12 18:27:50
Question: I am using the Explorer feature for classification. My .arff data file has 10 features with numeric and binary values (only the instance ID is nominal). I have about 16 instances. The class to predict is Yes/No. I have used Naive Bayes but I cannot interpret the results. Does anyone know how to interpret the results of a Naive Bayes classification? Answer 1: Naive Bayes doesn't select any important features. As you mentioned, the result of the training of a Naive Bayes classifier is the mean and