Bag of words training and testing opencv, matlab

前端 未结 2 1344
深忆病人
深忆病人 2021-01-30 09:39

I\'m implementing Bag Of Words in opencv by using SIFT features in order to make a classification for a specific dataset. So far, I have been apple to cluster the descriptors an

2条回答
  •  难免孤独
    2021-01-30 10:13

    Local features

    When you work with SIFT, you usually want to extract local features. What does that means? You have your image and from this image you will locate points from which you will extract local feature vectors. A local feature vector is just a vector consisting of numerical values that describes the visual information of the image region from which it was extracted. Although the number of local feature vectors that you can extract from image A does not need to be the same as the number of feature vectors that you can extract from image B, the number components of a local feature vector (i.e. its dimensionality) is always the same.

    Now, if you want to use your local feature vectors to classify images you have a problem. In traditional image classification, each image is described by a global feature vector, which, in the context of machine learning, can be seen as a set of numerical attributes. However, when you extract a set of local feature vectors you don't have a global representation of each image which is required for image classification. A technique that can be employed to solve this problem is the bag of words, also known as bag of visual words (BoW).

    Bag of visual words

    Here's the (very) simplified BoW algorithm:

    1. Extract the SIFT local feature vectors from your set of images;

    2. Put all this local feature vectors into a single set. At this point you don't even need to store from which image each local feature vector was extracted;

    3. Apply a clustering algorithm (e.g. k-means) over the set of local feature vectors in order to find centroid coordinates and assign an id to each centroid. This set of centroids will be your vocabulary;

    4. The global feature vector will be a histogram that counts how many times each centroid occurred in each image. To compute the histogram find the nearest centroid for each local feature vector.

    Image Classification

    Here I am assuming that your problem is the following:

    You have as input a set of labeled images and a set of non-labeled images which you want to assign a label based on its visual appearance. Suppose your problem is to classify landscape photography. You image labels could be, for example, “mountains”, “beach” or “forest”.

    The global feature vector extracted from each image (i.e. its bag of visual words) can be seen as a set of numerical attributes. This set of numerical attributes representing the visual characteristics of each image and the corresponding image labels can be used to train classifier. For example, you could use a data mining software such as Weka, which has an implementation of SVM, known as SMO, to solve your problem.

    Basically, you only have to format the global feature vectors and corresponding image labels according to the ARFF file format, which is, basically, a CSV of global feature vectors followed by image label.

提交回复
热议问题