Implementing bag-of-words object recognition using VLFeat

I am trying to implement bag-of-words (BoW) object recognition in MATLAB. The process is fairly involved, and I've had a lot of trouble finding proper documentation on the procedure. Could someone double-check whether my plan below makes sense? I'm using the VLFeat library extensively here.

Training:
1. Extract SIFT image descriptors with vl_sift
2. Quantize the descriptors with k-means (vl_hikmeans)
3. Take the quantized descriptors and create a histogram (VL_HIKMEANSHIST)
4. Create an SVM from the histograms (VL_PEGASOS?)
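
To make step 3 concrete, here is a minimal sketch in base MATLAB of how a bag-of-words histogram can be built once every descriptor has been assigned a visual-word index. The toy indices, the vocabulary size, and the variable names are all made up for illustration, and this is a flat histogram built with accumarray rather than whatever VL_HIKMEANSHIST computes for a hierarchical tree.

```matlab
% Hypothetical toy data: the visual-word index assigned to each descriptor
% of ONE image. In the real pipeline these come from the quantization step.
numWords = 5 ;                        % vocabulary size (assumed)
words    = [1 3 3 5 2 3 1 5 5 5] ;    % one word index per descriptor

% Count how many descriptors fall into each visual word.
h = accumarray(words(:), 1, [numWords 1]) ;

% L1-normalize so images with different descriptor counts are comparable.
h = h / sum(h) ;

disp(h') ;
```

Each image then contributes one such column vector, and stacking them side by side gives the feature matrix for the SVM step.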

I understand steps 1-3, but I'm not sure whether the SVM function is the right one. VL_PEGASOS has the following signature:

W = VL_PEGASOS(X, Y, LAMBDA)

How exactly do I use this function with the histogram that I create?
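
For what it's worth, one plausible reading of that signature is that X holds one histogram per column and Y holds a +1/-1 label per column for a single binary problem. The sketch below only prepares such inputs in base MATLAB; the toy histograms, labels, and lambda value are invented, and the vl_pegasos call is left commented out because the exact input classes it expects are an assumption to check against the VLFeat help text.

```matlab
% Hypothetical toy data: 4 training images, 3-bin histograms, 2 classes.
hists  = [0.7 0.6 0.1 0.2 ;     % one histogram per COLUMN (D x N)
          0.2 0.3 0.1 0.1 ;
          0.1 0.1 0.8 0.7] ;
labels = [1 1 2 2] ;            % class index of each training image
targetClass = 1 ;               % the class this binary SVM should detect

X = hists ;                                    % D x N data matrix
Y = int8(2 * (labels == targetClass) - 1) ;    % +1 for targetClass, -1 otherwise

lambda = 0.01 ;                 % regularization strength (arbitrary value)

% Assumed usage, following the quoted signature (not executed here):
% w = vl_pegasos(X, Y, lambda) ;

disp(Y) ;
```

With more than two classes, the same construction would be repeated once per class, changing targetClass each time.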

Finally, during the recognition stage, how do I match an image to one of the classes defined by the SVM?

Have you looked at their Caltech-101 example code? It is a full implementation of a BoW approach.

Here is the part where they classify with pegasos and evaluate the results:

% --------------------------------------------------------------------
%                                                            Train SVM
% --------------------------------------------------------------------

% Pegasos regularization parameter derived from the SVM cost C
lambda = 1 / (conf.svm.C *  length(selTrain)) ;
w = [] ;
% Train one one-vs-rest linear SVM per class
for ci = 1:length(classes)
  perm = randperm(length(selTrain)) ;
  fprintf('Training model for class %s\n', classes{ci}) ;
  % Labels: +1 for images of class ci, -1 for all others
  y = 2 * (imageClass(selTrain) == ci) - 1 ;
  data = vl_maketrainingset(psix(:,selTrain(perm)), int8(y(perm))) ;
  [w(:,ci) b(ci)] = vl_svmpegasos(data, lambda, ...
                                  'MaxIterations', 50/lambda, ...
                                  'BiasMultiplier', conf.svm.biasMultiplier) ;
end

model.b = conf.svm.biasMultiplier * b ;
model.w = w ;

% --------------------------------------------------------------------
%                                                Test SVM and evaluate
% --------------------------------------------------------------------

% Estimate the class of the test images
scores = model.w' * psix + model.b' * ones(1,size(psix,2)) ;
[drop, imageEstClass] = max(scores, [], 1) ;

% Compute the confusion matrix
idx = sub2ind([length(classes), length(classes)], ...
              imageClass(selTest), imageEstClass(selTest)) ;
confus = zeros(length(classes)) ;
confus = vl_binsum(confus, ones(size(idx)), idx) ;
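
To see what the test step above does on a small scale, here is a self-contained sketch in base MATLAB. The weights, biases, and test vectors are invented toy values, and accumarray stands in for vl_binsum, so treat it as an illustration of one-vs-rest scoring rather than the example's actual code.

```matlab
% Hypothetical toy model: 3 classes, 2-dimensional features.
W = [ 1  0 -1 ;               % D x C, one weight vector per class (column)
      0  1 -1] ;
b = [0 0 0.5] ;               % 1 x C, one bias per class
X = [ 2  0 -1  0.5 ;          % D x N, one test feature vector per column
      0  3 -1  0.1] ;
trueClass = [1 2 3 3] ;       % ground-truth class of each test image

% One score per (class, image) pair, same expression as in the code above.
scores = W' * X + b' * ones(1, size(X, 2)) ;

% Predicted class = row index of the largest score in each column.
[~, estClass] = max(scores, [], 1) ;

% Confusion matrix: rows are true classes, columns are predicted classes.
numClasses = size(W, 2) ;
confus = accumarray([trueClass(:) estClass(:)], 1, [numClasses numClasses]) ;

disp(estClass) ;
disp(confus) ;
```

Recognizing a new image therefore comes down to computing its feature vector, scoring it against every class model, and taking the class with the highest score.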

Source: https://stackoverflow.com/questions/11091972/implementing-bags-of-words-object-recognition-using-vlfeat

Tags

computer-vision

svm

sift