weka | 易学教程

Using LIBSVM to predict authenticity of the user

Question: I am planning to use LibSVM to predict user authenticity in web applications: (1) collect data on particular user behavior (e.g. login time, IP address, country); (2) use the collected data to train an SVM; (3) compare real-time data against the trained model to produce a level of authenticity. Can someone tell me how to do this with LibSVM? Can Weka be helpful for this type of problem?

Answer 1: The three steps you mention are an outline of the solution. In some more detail: Make sure you
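For step (2) of the question, LIBSVM's command-line tools read training data as sparse text lines of the form `<label> <index>:<value> ...` with 1-based, ascending feature indices. The following is a minimal sketch of producing such lines in Java. The three features (scaled login hour, known-IP flag, usual-country flag) and the class name `LibSvmLine` are hypothetical illustrations, not something the question or answer specifies.

```java
import java.util.Locale;

final class LibSvmLine {
    /**
     * Formats one sample as "<label> <index>:<value> ..." using 1-based indices.
     * Zero-valued features are skipped, which the sparse format allows.
     */
    static String format(boolean authentic, double[] features) {
        StringBuilder sb = new StringBuilder(authentic ? "+1" : "-1");
        for (int i = 0; i < features.length; i++) {
            if (features[i] != 0.0) {
                sb.append(' ').append(i + 1).append(':')
                  .append(String.format(Locale.ROOT, "%.4f", features[i]));
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Hypothetical features: login hour / 24, known-IP flag, usual-country flag.
        double[] genuine = {9 / 24.0, 1.0, 1.0};
        double[] suspicious = {3 / 24.0, 0.0, 0.0};
        System.out.println(format(true, genuine));
        System.out.println(format(false, suspicious));
    }
}
```

Categorical inputs such as country would need a numeric encoding (for example one flag per value) before they fit this format; which encoding suits the data is a design choice the excerpt does not settle.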

Hadoop: Easy way to have object as output value without Writable interface

Question: I am trying to use Hadoop to train multiple models. My data is small enough to fit in memory, so I want to train one model in every map task. My problem is that once a model has been trained, I need to send it to the reducer. I am using Weka to train the model. I don't want to implement the Writable interface in Weka classes, because that takes a lot of effort. I am looking for a simple way to do this. The Classifier class in Weka implements the
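One way to avoid a custom Writable is to turn the trained model into a byte array with standard Java serialization and ship the bytes (for example inside Hadoop's `BytesWritable`). The sketch below assumes the model object implements `java.io.Serializable`, as Weka classifiers generally do; the excerpt is cut off before it says so. The class name `ModelBytes` is hypothetical, and a plain `double[]` stands in for the model so the example runs without Weka or Hadoop.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;

final class ModelBytes {
    /** Serializes any Serializable object (e.g. a trained model) to bytes. */
    static byte[] toBytes(Serializable model) {
        try (ByteArrayOutputStream buffer = new ByteArrayOutputStream();
             ObjectOutputStream out = new ObjectOutputStream(buffer)) {
            out.writeObject(model);
            out.flush(); // push buffered object data into the byte buffer
            return buffer.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Restores the object on the receiving side (e.g. in the reducer). */
    static Object fromBytes(byte[] bytes) {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(bytes))) {
            return in.readObject();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException("Model class missing on classpath", e);
        }
    }

    public static void main(String[] args) {
        double[] standInModel = {0.25, -1.5, 3.0}; // placeholder for a trained model
        byte[] wire = toBytes(standInModel);
        double[] restored = (double[]) fromBytes(wire);
        System.out.println(wire.length + " bytes, first weight " + restored[0]);
    }
}
```

The receiving side needs the model's class on its classpath, otherwise deserialization fails with the `ClassNotFoundException` path above.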

Create a new weka Instance

Question: I'm new to Weka. I'm trying to create new instances to be labeled by a previously trained MultilayerPerceptron. I didn't know much about how to create an instance, so I took the first instance from my training data and modified it by changing the attribute values:

    // Opening the model
    public boolean abrirModelo(String ruta) {
        try {
            clasificador = (MultilayerPerceptron) weka.core.SerializationHelper.read(ruta);
            return true;
        } catch (IOException e) {
            System.out.println("Fallo la lectura

How to print out the predicted class after cross-validation in WEKA

Question: Once a 10-fold cross-validation has been run with a classifier, how can I print out the predicted class of every instance and the class distribution for each of these instances?

    J48 j48 = new J48();
    Evaluation eval = new Evaluation(newData);
    eval.crossValidateModel(j48, newData, 10, new Random(1));

When I tried something like the code below, it said that the classifier is not built.

    for (int i=0; i<data.numInstances(); i++){
        System.out.println(j48.distributionForInstance(newData.instance(i)));
    }

What I'm trying to

Exact implementation of RandomForest in Weka 3.7

Question: Having reviewed the original Breiman (2001) paper as well as some other board posts, I am slightly confused about the actual procedure used by WEKA's random forest implementation. None of the sources was sufficiently detailed, and many even contradict each other. How does it work in detail, and which steps are carried out? My understanding so far:

- For each tree, a bootstrap sample of the same size as the training data is created
- Only a random subset of the available features of defined size
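The two steps in the asker's list can be sketched in plain Java as follows. This only illustrates the description in the question, not Weka's actual code; the class and method names are hypothetical. Whether the feature subset is drawn once per tree or again at every split is part of what the question asks, so the sketch leaves that open and only shows how one such subset could be drawn.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Random;

final class ForestSampling {
    /** Draws n row indices from [0, n) with replacement: a bootstrap sample of size n. */
    static int[] bootstrapIndices(int n, Random rng) {
        int[] sample = new int[n];
        for (int i = 0; i < n; i++) {
            sample[i] = rng.nextInt(n);
        }
        return sample;
    }

    /** Picks k distinct feature indices out of numFeatures (no replacement). */
    static List<Integer> randomFeatureSubset(int numFeatures, int k, Random rng) {
        List<Integer> all = new ArrayList<>();
        for (int f = 0; f < numFeatures; f++) {
            all.add(f);
        }
        Collections.shuffle(all, rng);
        return new ArrayList<>(all.subList(0, k));
    }

    public static void main(String[] args) {
        Random rng = new Random(42); // fixed seed so repeated runs match
        int[] rows = bootstrapIndices(10, rng);
        List<Integer> features = randomFeatureSubset(8, 3, rng);
        System.out.println("rows used: " + java.util.Arrays.toString(rows));
        System.out.println("features considered: " + features);
    }
}
```

Because rows are drawn with replacement, some training rows appear several times in a sample and others not at all.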

weka stringToWordVector filter stringOptions

Question: I'm trying to filter a dataset using Weka's Java API. I've successfully filtered the attributes I want with a StringToWordVector filter in Weka's GUI, but I can't seem to do the same in my Java code. I copied the auto-generated filter parameters and pasted them into my code, but I keep getting errors. Currently, my code looks like this:

    Instances newInsts = new Instances(this.instances);
    StringToWordVector stringFilter = new StringToWordVector();
    stringFilter.setOptions(

Getting Xmeans clusterer output programmatically in Weka

Question: When using KMeans in Weka, one can call getAssignments() on the resulting model to get the cluster assignment for each given instance. Here's a (truncated) Jython example:

    >>> import weka.clusterers.SimpleKMeans as kmeans
    >>> kmeans.buildClusterer(data)
    >>> assignments = kmeans.getAssignments()
    >>> assignments
    array('i',[14, 16, 0, 0, 0, 0, 16,...])

The position of each cluster number corresponds to the instance index. So, instance 0 is in cluster 14, instance 1 is in cluster 16, and so
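To make the index-to-cluster mapping concrete, here is a small Java sketch that inverts an assignments array into a map from cluster number to the instance indices it contains. It works on a plain `int[]`, so it applies to any clusterer that can produce one assignment per instance; the class name `ClusterMembers` is hypothetical, and the input is the visible part of the array from the example above.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

final class ClusterMembers {
    /** Groups instance indices by the cluster number assigned to them. */
    static Map<Integer, List<Integer>> byCluster(int[] assignments) {
        Map<Integer, List<Integer>> members = new TreeMap<>();
        for (int instance = 0; instance < assignments.length; instance++) {
            members.computeIfAbsent(assignments[instance], c -> new ArrayList<>())
                   .add(instance);
        }
        return members;
    }

    public static void main(String[] args) {
        int[] assignments = {14, 16, 0, 0, 0, 0, 16};
        // prints {0=[2, 3, 4, 5], 14=[0], 16=[1, 6]}
        System.out.println(byCluster(assignments));
    }
}
```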

How to add LibSVM class to WEKA classpath on a Mac

Question: I am running Mac OS X 10.7 Lion and I want to use WEKA with LibSVM from the command line. I get this error:

    Problem evaluating classifier: libsvm classes not in CLASSPATH!

I found the LibSVM library here. I need to add it to my Java classpath so that WEKA can find it. The download contains several files, shown below. I don't know how to add them to my Java classpath. I am attempting to use the LibSVM classifier in WEKA because I prefer it over SMO. I am also unsure if this means

Example for svm feature selection in R

Question: I'm trying to apply feature selection (e.g. recursive feature selection) with an SVM, using the R package. I've installed Weka, which supports feature selection in LibSVM, but I haven't found any example of the SVM syntax or anything similar. A short example would be a great help.

Answer 1: The function rfe in the caret package performs recursive feature selection for various algorithms. Here's an example from the caret documentation:

    library(caret)
    data(BloodBrain, package="caret")
    x <- scale

Too many attributes for ARFF format in Weka

Question: I am working with a dataset of more than 10,000 dimensions. To use Weka I need to convert a text file into ARFF format, but because there are so many attributes, the file is too large even in sparse ARFF format. Is there a method, similar to the sparse format for the data, that avoids writing so many attribute identifiers in the header of the ARFF file? For example:

    @attribute A1 NUMERICAL
    @attribute A2 NUMERICAL
    ... ...
    @attribute A10000 NUMERICAL

Answer 1: I coded a script in AWK to format the following lines (in a TXT
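The answer's AWK script is cut off in this excerpt. As an alternative sketch in Java, the header can be generated in a loop instead of written by hand, and each data row can be emitted in sparse form (`{index value, ...}` with 0-based attribute indices). The class name `SparseArff` and the relation name are hypothetical. Note that the ARFF format documents `numeric` as the type keyword, so the sketch uses that rather than `NUMERICAL`.

```java
final class SparseArff {
    /** Builds an ARFF header declaring attributes A1..An as numeric. */
    static String header(String relation, int numAttributes) {
        StringBuilder sb = new StringBuilder("@relation ").append(relation).append("\n\n");
        for (int i = 1; i <= numAttributes; i++) {
            sb.append("@attribute A").append(i).append(" numeric\n");
        }
        return sb.append("\n@data\n").toString();
    }

    /** Formats one row in sparse ARFF form, listing only non-zero values. */
    static String sparseRow(double[] values) {
        StringBuilder sb = new StringBuilder("{");
        for (int i = 0; i < values.length; i++) {
            if (values[i] != 0.0) {
                if (sb.length() > 1) {
                    sb.append(", ");
                }
                sb.append(i).append(' ').append(values[i]); // 0-based attribute index
            }
        }
        return sb.append('}').toString();
    }

    public static void main(String[] args) {
        System.out.print(header("highdim", 4));
        System.out.println(sparseRow(new double[]{0.0, 2.5, 0.0, 1.0}));
    }
}
```

This does not shrink the header itself, since every attribute still has to be declared once; it only removes the need to write the declarations manually.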