libsvm

Loading a Dataset for Linear SVM Classification from a CSV file

喜欢而已 submitted on 2020-01-30 08:42:06
Question: I have a CSV file below called train.csv: 25.3, 12.4, 2.35, 4.89, 1, 2.35, 5.65, 7, 6.24, 5.52, M 20, 15.34, 8.55, 12.43, 23.5, 3, 7.6, 8.11, 4.23, 9.56, B 4.5, 2.5, 2, 5, 10, 15, 20.25, 43, 9.55, 10.34, B 1.5, 2.5, 3.5, 4.5, 5.5, 6.5, 7.5, 8.5, 9.5, 10.5, M I am trying to get this dataset separated and classified as the following (this is the output I want): [[25.3, 12.4, 2.35, 4.89, 1, 2.35, 5.65, 7, 6.24, 5.52], [20, 15.34, 8.55, 12.43, 23.5, 3, 7.6, 8.11, 4.23, 9.56], [4.5, 2.5, 2, 5,
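A minimal sketch of one way to split such a file into a feature matrix and a label list (assuming, as in the question, comma-separated numeric columns followed by an M/B class label in the last column; the function name is illustrative):

```python
import csv

def load_dataset(path):
    """Parse a CSV of numeric features with a trailing class label."""
    features, labels = [], []
    with open(path) as f:
        for row in csv.reader(f):
            if not row:
                continue  # skip blank lines
            features.append([float(v) for v in row[:-1]])
            labels.append(row[-1].strip())
    return features, labels
```

The nested `features` list matches the desired output shown above, and `labels` (e.g. `['M', 'B', ...]`) can be passed alongside it to an SVM trainer.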

SVM classification with always high precision

一曲冷凌霜 submitted on 2020-01-25 13:27:30

Question: I have a binary classification problem and I'm trying to get a precision-recall curve for my classifier. I use libsvm with the RBF kernel and the probability-estimate option. To get the curve, I'm changing the decision threshold from 0 to 1 in steps of 0.1. But on every run I get high precision, even though recall decreases as the threshold increases. My false-positive count always seems low compared to my true positives. My results are these: Threshold: 0.1 TOTAL TP:393, FP:1, FN: 49 Precision:0.997462, Recall: 0
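For reference, precision and recall at a given threshold can be computed directly from probability estimates (as libsvm's -b 1 option produces); this sketch uses made-up probabilities, not the asker's data, purely to show the bookkeeping:

```python
def precision_recall_at(threshold, probs, labels):
    """Precision/recall when predicting positive for prob >= threshold."""
    tp = sum(1 for p, y in zip(probs, labels) if p >= threshold and y == 1)
    fp = sum(1 for p, y in zip(probs, labels) if p >= threshold and y == 0)
    fn = sum(1 for p, y in zip(probs, labels) if p < threshold and y == 1)
    precision = tp / (tp + fp) if tp + fp else 1.0  # convention: no predictions -> 1.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall
```

Sweeping `threshold` over 0.1, 0.2, ... and collecting the pairs gives the points of the precision-recall curve.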

libsvm cross-validation and grid search (parameter selection)

核能气质少年 submitted on 2020-01-24 05:30:14
First, cross-validation. Cross-validation is a technique for evaluating how well a statistical analysis or machine learning algorithm generalizes to data independent of the training data, and it helps avoid overfitting. Cross-validation should generally satisfy two conditions: 1) the training set should be large enough, usually more than half of the data; 2) the training and test sets should be sampled uniformly. Cross-validation mainly falls into the following categories. 1) Double cross-validation. Double cross-validation, also called 2-fold cross-validation (2-CV), splits the dataset into two equally sized subsets and trains the classifier in two rounds. In the first round, one subset serves as the training set and the other as the test set; in the second round, the two are swapped and the classifier is trained again, and what we mainly care about is the test-set accuracy from the two rounds. In practice, 2-CV is rarely used, chiefly because the training set has too few samples to represent the distribution of the population, so test-phase accuracy tends to show a marked gap. Moreover, the variability between the subsets in 2-CV is high, which often fails the requirement that "the experiment must be reproducible". 2) k-fold cross-validation. K-fold cross-validation (k-CV) extends double cross-validation: the dataset is split into k subsets, each subset serves as the test set exactly once, and the remaining subsets are used for training. k-CV thus repeats the train/test cycle k times
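The k-fold procedure described above can be sketched as follows; `evaluate` is a stand-in for any train-and-score routine (libsvm's svm-train does this internally via its -v option):

```python
def k_fold_indices(n, k):
    """Split indices 0..n-1 into k contiguous folds of near-equal size."""
    fold_sizes = [n // k + (1 if i < n % k else 0) for i in range(k)]
    folds, start = [], 0
    for size in fold_sizes:
        folds.append(list(range(start, start + size)))
        start += size
    return folds

def cross_validate(data, k, evaluate):
    """Use each fold once as the test set; average the k scores."""
    folds = k_fold_indices(len(data), k)
    scores = []
    for i, test_idx in enumerate(folds):
        train_idx = [j for f in folds[:i] + folds[i + 1:] for j in f]
        scores.append(evaluate([data[j] for j in train_idx],
                               [data[j] for j in test_idx]))
    return sum(scores) / k
```

In practice the data should be shuffled (or stratified by class) before splitting, to satisfy the uniform-sampling requirement above.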

How to create training data for libsvm (as an svm_node struct)

谁都会走 submitted on 2020-01-21 07:18:35

Question: I am trying to train an SVM for a simple XOR problem programmatically, using libsvm, to understand how the library works. The problem, I think, is that I construct svm_node incorrectly; maybe I have trouble understanding the whole pointers-to-pointers thing. Could anybody help with this? I first construct a matrix for the XOR problem and then try to assign values from the matrix to svm_node (I am using two steps here because my real data will be in matrix format). When testing the model I
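Without presuming what the asker's C code looks like, the layout libsvm expects can be mirrored in plain Python to make the structure explicit: each sample is an array of (index, value) pairs with 1-based feature indices, terminated by a sentinel node with index -1, and svm_problem.x is then an array of such arrays (hence the pointer-to-pointer type in C). A sketch for the XOR data:

```python
def to_svm_nodes(row):
    """Mirror libsvm's svm_node array for one sample: 1-based indices plus sentinel."""
    nodes = [(i + 1, float(v)) for i, v in enumerate(row)]
    nodes.append((-1, 0.0))  # sentinel terminating the sample
    return nodes

xor_inputs = [[0, 0], [0, 1], [1, 0], [1, 1]]
xor_labels = [0, 1, 1, 0]

# Analogue of svm_problem.x: one node array per sample.
problem_x = [to_svm_nodes(row) for row in xor_inputs]
```

In the C code, the usual bug is forgetting either the 1-based indexing or the index = -1 terminator on each row; without the terminator, libsvm reads past the end of the array.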

Implementation of SVM-RFE Algorithm in R

二次信任 submitted on 2020-01-15 11:07:16
Question: I'm using the R code implementing the SVM-RFE algorithm from this source, http://www.uccor.edu.ar/paginas/seminarios/Software/SVM_RFE_R_implementation.pdf, but I made a small modification so that the R code uses the gnum library. The code is the following:

svmrfeFeatureRanking = function(x, y){
  n = ncol(x)
  survivingFeaturesIndexes = seq(1:n)
  featureRankedList = vector(length = n)
  rankedFeatureIndex = n
  while(length(survivingFeaturesIndexes) > 0){
    # train the support vector machine
    svmModel =
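The R snippet above is cut off. As a language-neutral illustration of the loop it implements (train a linear SVM on the surviving features, rank them by the squared weights w_i^2, eliminate the worst, repeat), here is a minimal Python sketch in which `train_weights` is a hypothetical stand-in for fitting an SVM and returning its weight vector over the surviving features:

```python
def svm_rfe(n_features, train_weights):
    """Recursive feature elimination; returns feature indices best-to-worst."""
    surviving = list(range(n_features))
    ranked = []  # worst feature is appended first
    while surviving:
        w = train_weights(surviving)  # weight per surviving feature
        worst = min(range(len(surviving)), key=lambda i: w[i] ** 2)
        ranked.append(surviving.pop(worst))
    return ranked[::-1]  # reverse so the best feature comes first
```

Retraining once per eliminated feature is the expensive part; the original SVM-RFE paper (Guyon et al.) notes that chunks of features can be removed per iteration to speed this up.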

Making an input text file as a training data set in libsvm

我的梦境 submitted on 2020-01-06 14:07:22

Question: I am working on hydraulic simulation using EPANET, and I have obtained various leaky and normal flow and pressure values for nodes and links. I want to train on these output values using libsvm. I have the output of the hydraulic simulation in a different format. My question is whether it is possible to add that output as training input to libsvm. Answer 1: If your output format can be stored in a generic CSV format, you can use csv2libsvm to convert the data into libsvm format. The git repo can be found
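The conversion itself is simple: libsvm's text format puts the label first, followed by 1-based index:value pairs, and zero-valued features may be omitted since the format is sparse. A sketch of the per-row transformation a tool like csv2libsvm performs:

```python
def csv_row_to_libsvm(label, values):
    """Render one sample as a libsvm-format line: 'label 1:v1 2:v2 ...'."""
    pairs = " ".join(f"{i + 1}:{v}" for i, v in enumerate(values) if v != 0)
    return f"{label} {pairs}".strip()
```

Writing one such line per sample to a text file yields a training file that svm-train accepts directly.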

One class SVM probability estimates and what is the difference between one class SVM and clustering

被刻印的时光 ゝ submitted on 2020-01-02 09:54:05

Question: I have a set of images. I would like to learn a one-class SVM (OC-SVM) to model the distribution of a particular (positive) class, as I don't have enough examples to represent the other (negative) classes. What I understood about OC-SVM is that it tries to separate the data from the origin, or in other words it tries to learn a hypersphere that fits the one-class data. My questions are: if I want to use the output of the OC-SVM as a probability estimate, how can I do it? What is the difference
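On the first question: libsvm's one-class SVM does not produce calibrated probabilities. A common workaround (an approximation, not a libsvm feature) is to squash the signed decision value through a sigmoid; the scale parameter here is an illustrative assumption, and proper calibration, e.g. Platt scaling, requires fitting it on held-out data:

```python
import math

def pseudo_probability(decision_value, scale=1.0):
    """Map an OC-SVM decision value to (0, 1) via a sigmoid; not calibrated."""
    return 1.0 / (1.0 + math.exp(-scale * decision_value))
```

Values near the boundary (decision value 0) map to 0.5, deep inliers approach 1, and far outliers approach 0, which preserves the ranking even though the absolute numbers are not true probabilities.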
