Please, I would like to classify a set of images into 4 classes using SIFT descriptors and an SVM. Using the SIFT extractor, I get a different number of keypoints for each image; for example, img1 has 100 keypoints while other images have different counts. How can I build a fixed-size feature vector for each image so I can train the SVM?
You will always get a different number of keypoints for different images, but the feature vector of each descriptor has the same size, i.e. 128. People usually apply vector quantization (K-means clustering) to the descriptors and build a Bag-of-Words histogram per image. You can have a look at this thread.
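For illustration, here is a minimal sketch of the extraction step with OpenCV (assuming a build where SIFT is exposed as cv2.SIFT_create; the file name img1.jpg is hypothetical). Note that the number of keypoints changes per image, but every descriptor row is 128-dimensional:

```python
import cv2

img = cv2.imread("img1.jpg", cv2.IMREAD_GRAYSCALE)  # hypothetical file name
sift = cv2.SIFT_create()
keypoints, descriptors = sift.detectAndCompute(img, None)

# The number of keypoints varies from image to image,
# but every descriptor is 128-dimensional.
print(len(keypoints))        # e.g. 100 for one image, 55 for another
print(descriptors.shape)     # (num_keypoints, 128)
```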
In this case, perhaps dense SIFT is a good choice.
There are two main stages:
Stage 1: Creating a codebook.
Choose the codebook size, say k, and run K-means over the SIFT descriptors of the training images. Each image produces a matrix Vi (i <= n, where n is the number of images used to create the codebook) of size 128 * m, where m is the number of keypoints gathered from that image. The input to K-means is therefore a big matrix V created by horizontal concatenation of the Vi, for all i. The output of K-means is a matrix C of size 128 * k, whose columns are the codewords.
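A minimal sketch of Stage 1 in Python, assuming scikit-learn is available and that all_descriptors is a hypothetical list holding one descriptor array per training image as returned by OpenCV (descriptors as rows, i.e. the transpose of the 128 * m convention above); the codebook size k = 200 is an arbitrary choice:

```python
import numpy as np
from sklearn.cluster import MiniBatchKMeans

k = 200  # codebook size, a free parameter

# Stack the descriptors of all training images into one big matrix
# (this is V, the concatenation of the per-image matrices Vi).
V = np.vstack(all_descriptors)           # shape (sum_i m_i, 128)

kmeans = MiniBatchKMeans(n_clusters=k, random_state=0).fit(V)
C = kmeans.cluster_centers_              # shape (k, 128): the codebook
```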
Stage 2: Calculating Histograms.
For each image in the dataset, do the following:
1. Create a histogram h of size k and initialize it to zeros.
2. For each descriptor in the image, find the nearest codeword (column of C) and increment the corresponding bin of h by 1.
3. Normalize h by its L1 or L2 norm.
Now h is ready for classification.
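Stage 2 can be written as a short function; this sketch assumes the kmeans object and codebook size k from the Stage 1 sketch, and that descriptors is the (m, 128) array of a single image:

```python
import numpy as np

def bow_histogram(descriptors, kmeans, k, norm="l2"):
    # Create a histogram of size k and initialize it to zeros.
    h = np.zeros(k)
    # Find the nearest codeword for each descriptor and increment that bin.
    assignments = kmeans.predict(descriptors)
    for a in assignments:
        h[a] += 1
    # Normalize by the L1 or L2 norm.
    denom = np.sum(h) if norm == "l1" else np.linalg.norm(h)
    return h / denom if denom > 0 else h

# Stacking one histogram per image gives a fixed-length feature matrix H
# that can be fed to an SVM, e.g. sklearn.svm.SVC().fit(H, labels).
```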
Another possibility is to use Fisher vectors instead of a codebook: https://hal.inria.fr/file/index/docid/633013/filename/jegou_aggregate.pdf
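For reference, a simplified sketch of the idea: a GMM is fitted to the training descriptors (V from the Stage 1 sketch) and each image is encoded by the gradient with respect to the GMM means. The number of components and the normalization choices below are arbitrary, and this is not a faithful implementation of the linked paper:

```python
import numpy as np
from sklearn.mixture import GaussianMixture

# Fit a small diagonal-covariance GMM to the stacked training descriptors V.
gmm = GaussianMixture(n_components=16, covariance_type="diag",
                      random_state=0).fit(V)

def fisher_vector(descriptors, gmm):
    n, d = descriptors.shape
    q = gmm.predict_proba(descriptors)                        # posteriors, (n, K)
    diff = descriptors[:, None, :] - gmm.means_[None, :, :]   # (n, K, d)
    diff /= np.sqrt(gmm.covariances_)[None, :, :]             # whiten per dimension
    # Gradient with respect to the means: one d-dim block per component.
    g_mu = np.einsum("nk,nkd->kd", q, diff) / (n * np.sqrt(gmm.weights_))[:, None]
    fv = g_mu.ravel()                                          # length K * d
    # Power- and L2-normalization, as is common for Fisher vectors.
    fv = np.sign(fv) * np.sqrt(np.abs(fv))
    return fv / (np.linalg.norm(fv) + 1e-12)
```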
Using the conventional SIFT approach you will never have the same number of keypoints in every image. One way of achieving a fixed count is to sample the descriptors densely, using dense SIFT, which places a regular grid of keypoints on top of the image. If all images have the same size, then you will have the same number of keypoints per image.
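A minimal sketch of dense sampling with OpenCV, assuming cv2.SIFT_create is available; the grid step and keypoint size are arbitrary choices:

```python
import cv2

img = cv2.imread("img1.jpg", cv2.IMREAD_GRAYSCALE)  # hypothetical file name
step = 8          # grid spacing in pixels
kp_size = 8       # keypoint diameter passed to cv2.KeyPoint

# Build a fixed grid of keypoints instead of detecting them.
grid = [cv2.KeyPoint(float(x), float(y), kp_size)
        for y in range(0, img.shape[0], step)
        for x in range(0, img.shape[1], step)]

sift = cv2.SIFT_create()
_, descriptors = sift.compute(img, grid)
print(descriptors.shape)   # (len(grid), 128), identical for same-sized images
```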