I have a sample image which contains an object, such as the earrings in the following image:
http://imgur.com/luj2Z
I then have a large candidate set of images, and I need to determine which of those candidates contain the same object.
Fortunately, the OpenCV developers have already done that for you. Look in your samples folder for "opencv\samples\cpp\matching_to_many_images.cpp". Compile it and give it a try with the default images.
The algorithm can be easily adapted to make it faster or more precise.
Broadly, object recognition algorithms are split into two parts: keypoint detection & description, and object matching. For both there are many algorithms and variants, all of which you can experiment with directly in OpenCV.
Detection/description can be done with: SIFT, SURF, ORB, GFTT, STAR, FAST, and others.
For matching you have: brute force, Hamming, etc. (Some matching methods are specific to a given detection algorithm.)
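For example, here is a minimal sketch of how a detector/matcher pair can be swapped out with the OpenCV 2.4-era Python API (the same version the sample further down is tested with); 'train.png' and 'scene.png' are placeholder file names:

import cv2

img1 = cv2.imread('train.png', 0)   # cropped object image (placeholder name)
img2 = cv2.imread('scene.png', 0)   # candidate image (placeholder name)

# Any detector/matcher combination can be plugged in here, e.g.
# cv2.SIFT() or cv2.SURF(400) with cv2.BFMatcher(cv2.NORM_L2).
detector = cv2.ORB()
matcher = cv2.BFMatcher(cv2.NORM_HAMMING)   # ORB is binary -> Hamming distance

kp1, desc1 = detector.detectAndCompute(img1, None)
kp2, desc2 = detector.detectAndCompute(img2, None)
matches = matcher.match(desc1, desc2)
print '%d raw matches' % len(matches)

Swapping detector/matcher for any of the other combinations listed above is a one-line change, which is what makes the trial-and-error approach below practical.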
HINTS to start:
Crop your original image so that the interesting object covers as much of the image area as possible, and use that as your training image.
SIFT is the most accurate descriptor, but also the slowest. FAST is a good combination of speed and accuracy. GFTT is old and rather unreliable. ORB is newly added to OpenCV and is very promising in both speed and accuracy.
So you can find the best combination for your case by trial and error.
For the details of each implementation, you should read the original papers/tutorials. Google Scholar is a good start.
As mentioned, algorithms like SIFT and SURF consist of two parts: a feature point detector, which finds points that are invariant to a number of distortions, and a descriptor, which aims to robustly model each feature point's surroundings.
The latter is increasingly used for image categorization and identification in what is commonly known as the "bag of words" or "visual words" approach.
In its simplest form, one can collect all data from all descriptors of all images and cluster them, for example using k-means. Every original image then has descriptors that contribute to a number of clusters. The centroids of these clusters, i.e. the visual words, can be used as a new descriptor for the image. These can then be used in an architecture with an inverted file design.
This approach allows for soft matching and for a certain amount of generalization, e.g. retrieving all images containing airplanes.
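As a rough sketch of the clustering step only (not the inverted file part), assuming the same OpenCV 2.4-era Python API as the sample further down; the file names and the vocabulary size k are arbitrary placeholders:

import numpy
import cv2

# Collect SURF descriptors from a few images (placeholder file names).
detector = cv2.SURF(3200)
images = [cv2.imread(fn, 0) for fn in ['a.png', 'b.png', 'c.png']]
descs = [detector.detectAndCompute(im, None)[1] for im in images]

# Cluster all descriptors from all images into k visual words.
k = 50  # vocabulary size, an arbitrary choice here
data = numpy.vstack(descs).astype(numpy.float32)
criteria = (cv2.TERM_CRITERIA_EPS + cv2.TERM_CRITERIA_MAX_ITER, 10, 1.0)
_, _, centers = cv2.kmeans(data, k, criteria, 10, cv2.KMEANS_RANDOM_CENTERS)

# Re-describe each image as a histogram over the visual words.
def bow_histogram(desc, centers):
    dists = ((desc[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    return numpy.bincount(dists.argmin(axis=1), minlength=len(centers))

histograms = [bow_histogram(d, centers) for d in descs]

Comparing these histograms (e.g. with a cosine or chi-squared distance) is then what gives you the soft matching described above.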
The VLFeat website contains, next to an excellent SIFT library, a nice demo of this approach that classifies the Caltech 101 dataset.
Caltech itself offers MATLAB/C++ software together with the relevant publications.
The work by LEAR is also a good starting point.
Check out the SURF features, which are part of OpenCV. The idea here is that you have an algorithm for finding "interest points" in two images, and another algorithm for computing a descriptor of the image patch around each interest point. Typically this descriptor captures the distribution of edge orientations in the patch. You then try to find point correspondences, i.e. for each interest point in image A you look for a corresponding interest point in image B. This is accomplished by comparing descriptors and keeping the closest matches. Finally, if you have a set of correspondences that are related by some geometric transformation, you have a detection.
Of course, this is a very high-level explanation. The devil is in the details, and for those you should read some papers. Start with "Distinctive Image Features from Scale-Invariant Keypoints" by David Lowe, and then read the papers on SURF.
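To make the geometric-verification step concrete, here is a hedged sketch of the whole decision pipeline (the file names, the SURF threshold of 400, the 0.75 ratio, and the RANSAC parameters are assumptions to tune, not fixed values):

import numpy
import cv2

img1 = cv2.imread('object.png', 0)   # placeholder file names
img2 = cv2.imread('scene.png', 0)

surf = cv2.SURF(400)
kp1, d1 = surf.detectAndCompute(img1, None)
kp2, d2 = surf.detectAndCompute(img2, None)

# Nearest-neighbour matching plus Lowe's ratio test.
matches = cv2.BFMatcher(cv2.NORM_L2).knnMatch(d1, d2, k=2)
good = [m[0] for m in matches
        if len(m) == 2 and m[0].distance < 0.75 * m[1].distance]

# Geometric verification: the surviving matches must agree on a
# single homography; the inlier count serves as a detection score.
if len(good) >= 4:
    p1 = numpy.float32([kp1[m.queryIdx].pt for m in good])
    p2 = numpy.float32([kp2[m.trainIdx].pt for m in good])
    H, status = cv2.findHomography(p1, p2, cv2.RANSAC, 5.0)
    print 'inlier correspondences:', int(status.sum())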
Also, consider moving this question to the Signal and Image Processing Stack Exchange.
In case someone comes along in the future, here's a small sample doing this with OpenCV. It's based on the OpenCV sample, but (in my opinion) this version is a bit clearer, so I'm including it as well.
Tested with OpenCV 2.4.4:
#!/usr/bin/env python
'''
Uses SURF to match two images.
Finds common features between two images and draws them
Based on the sample code from opencv:
samples/python2/find_obj.py
USAGE
find_obj.py <image1> <image2>
'''
import sys
import numpy
import cv2
###############################################################################
# Image Matching
###############################################################################
def match_images(img1, img2, img1_features=None, img2_features=None):
    """Given two images, returns the matched keypoint pairs"""
    # SURF with a Hessian threshold of 3200, so only fairly strong
    # features are kept; brute-force matching with L2 distance.
    detector = cv2.SURF(3200)
    matcher = cv2.BFMatcher(cv2.NORM_L2)

    if img1_features is None:
        kp1, desc1 = detector.detectAndCompute(img1, None)
    else:
        kp1, desc1 = img1_features

    if img2_features is None:
        kp2, desc2 = detector.detectAndCompute(img2, None)
    else:
        kp2, desc2 = img2_features

    #print 'img1 - %d features, img2 - %d features' % (len(kp1), len(kp2))

    # Ask for the two nearest neighbours of each descriptor, so the
    # ratio test in filter_matches can be applied.
    raw_matches = matcher.knnMatch(desc1, trainDescriptors=desc2, k=2)
    kp_pairs = filter_matches(kp1, kp2, raw_matches)
    return kp_pairs
def filter_matches(kp1, kp2, matches, ratio=0.75):
    """Keeps only the matches that pass Lowe's ratio test"""
    mkp1, mkp2 = [], []
    for m in matches:
        # Accept a match only if it is clearly better than the
        # second-best candidate.
        if len(m) == 2 and m[0].distance < m[1].distance * ratio:
            m = m[0]
            mkp1.append(kp1[m.queryIdx])
            mkp2.append(kp2[m.trainIdx])
    kp_pairs = zip(mkp1, mkp2)
    return kp_pairs
###############################################################################
# Match Displaying
###############################################################################
def draw_matches(window_name, kp_pairs, img1, img2):
    """Draws the matches"""
    mkp1, mkp2 = zip(*kp_pairs)

    H = None
    status = None

    # At least 4 point pairs are needed to estimate a homography.
    if len(kp_pairs) >= 4:
        p1 = numpy.float32([kp.pt for kp in mkp1])
        p2 = numpy.float32([kp.pt for kp in mkp2])
        H, status = cv2.findHomography(p1, p2, cv2.RANSAC, 5.0)

    if len(kp_pairs):
        explore_match(window_name, img1, img2, kp_pairs, status, H)
def explore_match(win, img1, img2, kp_pairs, status=None, H=None):
    """Draws lines between the matched features"""
    h1, w1 = img1.shape[:2]
    h2, w2 = img2.shape[:2]

    # Place both grayscale images side by side on one canvas.
    vis = numpy.zeros((max(h1, h2), w1 + w2), numpy.uint8)
    vis[:h1, :w1] = img1
    vis[:h2, w1:w1 + w2] = img2
    vis = cv2.cvtColor(vis, cv2.COLOR_GRAY2BGR)

    # If a homography was found, project the corners of img1 into
    # img2 and outline the detected object.
    if H is not None:
        corners = numpy.float32([[0, 0], [w1, 0], [w1, h1], [0, h1]])
        reshaped = cv2.perspectiveTransform(corners.reshape(1, -1, 2), H)
        reshaped = reshaped.reshape(-1, 2)
        corners = numpy.int32(reshaped + (w1, 0))
        cv2.polylines(vis, [corners], True, (255, 255, 255))

    if status is None:
        status = numpy.ones(len(kp_pairs), numpy.bool_)
    p1 = numpy.int32([kpp[0].pt for kpp in kp_pairs])
    p2 = numpy.int32([kpp[1].pt for kpp in kp_pairs]) + (w1, 0)

    green = (0, 255, 0)
    red = (0, 0, 255)

    # Inliers get green dots, outliers get red crosses.
    for (x1, y1), (x2, y2), inlier in zip(p1, p2, status):
        if inlier:
            col = green
            cv2.circle(vis, (x1, y1), 2, col, -1)
            cv2.circle(vis, (x2, y2), 2, col, -1)
        else:
            col = red
            r = 2
            thickness = 3
            cv2.line(vis, (x1 - r, y1 - r), (x1 + r, y1 + r), col, thickness)
            cv2.line(vis, (x1 - r, y1 + r), (x1 + r, y1 - r), col, thickness)
            cv2.line(vis, (x2 - r, y2 - r), (x2 + r, y2 + r), col, thickness)
            cv2.line(vis, (x2 - r, y2 + r), (x2 + r, y2 - r), col, thickness)

    # Connect the inlier pairs with green lines.
    for (x1, y1), (x2, y2), inlier in zip(p1, p2, status):
        if inlier:
            cv2.line(vis, (x1, y1), (x2, y2), green)

    cv2.imshow(win, vis)
###############################################################################
# Test Main
###############################################################################
if __name__ == '__main__':
    if len(sys.argv) < 3:
        print "No filenames specified"
        print "USAGE: find_obj.py <image1> <image2>"
        sys.exit(1)

    fn1 = sys.argv[1]
    fn2 = sys.argv[2]

    # Load both images as grayscale.
    img1 = cv2.imread(fn1, 0)
    img2 = cv2.imread(fn2, 0)

    if img1 is None:
        print 'Failed to load fn1:', fn1
        sys.exit(1)
    if img2 is None:
        print 'Failed to load fn2:', fn2
        sys.exit(1)

    kp_pairs = match_images(img1, img2)

    if kp_pairs:
        draw_matches('find_obj', kp_pairs, img1, img2)
        cv2.waitKey()
        cv2.destroyAllWindows()
    else:
        print "No matches found"