Detecting if an object from one image is in another image with OpenCV


I have a sample image which contains an object, such as the earrings in the following image:

http://imgur.com/luj2Z

I then have a large candidate set of images, and I want to determine which of them contain the object from the sample image.

4 Answers
  • 2021-01-01 01:26

    Fortunately, the kind guys from OpenCV have already done this for you. Check your samples folder for "opencv\samples\cpp\matching_to_many_images.cpp". Compile it and give it a try with the default images.

    The algorithm can be easily adapted to make it faster or more precise.

    Mainly, object recognition algorithms are split into two parts: keypoint detection & description, and object matching. For both of them there are many algorithms/variants with which you can play directly in OpenCV.

    Detection/description can be done by: SIFT/SURF/ORB/GFTT/STAR/FAST and others.

    For matching you have: brute force, Hamming, etc. (Some matching methods are specific to a given detection algorithm.)

    HINTS to start:

    • crop your original image so the interesting object covers as much as possible of the image area. Use it as training.

    • SIFT is the most accurate but also the slowest descriptor. FAST is a good combination of speed and accuracy. GFTT is old and quite unreliable. ORB is newly added to OpenCV and is very promising, both in speed and accuracy.

    • The results depend on the pose of the object in the other image. If it is resized, rotated, squeezed, partly covered, etc., try SIFT. If it is a simple task (i.e. the object appears at almost the same size/rotation/etc.), most of the descriptors will cope well.
    • ORB may not yet be in the OpenCV release. Try downloading the latest from the OpenCV trunk and compiling it: https://code.ros.org/svn/opencv/trunk

    So, you can find the best combination for your case by trial and error.

    For the details of every implementation, you should read the original papers/tutorials. Google Scholar is a good start.
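
    To get a feel for how these pieces plug together, here is a minimal sketch in Python (not the C++ sample above). It assumes a modern OpenCV build where ORB ships by default as cv2.ORB_create, and uses placeholder filenames:

    import cv2

    # Placeholder filenames: a cropped training image and one candidate image.
    train = cv2.imread('object.jpg', cv2.IMREAD_GRAYSCALE)
    scene = cv2.imread('candidate.jpg', cv2.IMREAD_GRAYSCALE)

    # Keypoint detection + description in one call; nfeatures caps the
    # number of keypoints kept.
    detector = cv2.ORB_create(nfeatures=1000)
    kp1, desc1 = detector.detectAndCompute(train, None)
    kp2, desc2 = detector.detectAndCompute(scene, None)

    # ORB produces binary descriptors, so match with Hamming distance;
    # float descriptors such as SIFT's would use cv2.NORM_L2 instead.
    matcher = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)
    matches = sorted(matcher.match(desc1, desc2), key=lambda m: m.distance)
    print('%d tentative matches' % len(matches))

    Trying another detector/descriptor combination (and the matching norm that goes with it) is just a matter of swapping the two constructor lines.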

  • 2021-01-01 01:31

    As said, algorithms like SIFT and SURF consist of a feature point detector, which is invariant to a number of distortions, and a descriptor, which aims to robustly model the feature point's surroundings.

    The latter is increasingly used for image categorization and identification in what is commonly known as the "bag of words" or "visual words" approach.

    In its simplest form, one can collect the descriptors from all images and cluster them, for example using k-means. Every original image then has descriptors that contribute to a number of clusters. The centroids of these clusters, i.e. the visual words, can be used as a new descriptor for the image. These can then be used in an architecture with an inverted file design.

    This approach allows for soft matching and for a certain amount of generalization, e.g. retrieving all images with airplanes.
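
    A minimal sketch of that clustering step, assuming OpenCV 4.4+ (for cv2.SIFT_create) and scikit-learn's KMeans; the filenames and vocabulary size are made up for illustration:

    import cv2
    import numpy as np
    from sklearn.cluster import KMeans

    K = 100  # vocabulary size (number of visual words), arbitrary here
    detector = cv2.SIFT_create()

    # 1. Pool the descriptors of all images.
    per_image = []
    for fn in ['img1.jpg', 'img2.jpg', 'img3.jpg']:  # placeholder filenames
        img = cv2.imread(fn, cv2.IMREAD_GRAYSCALE)
        _, desc = detector.detectAndCompute(img, None)
        per_image.append(desc)
    all_desc = np.vstack(per_image)

    # 2. Cluster the pooled descriptors; the K centroids are the visual words.
    kmeans = KMeans(n_clusters=K).fit(all_desc)

    # 3. Re-describe each image as a normalized histogram of word counts.
    histograms = []
    for desc in per_image:
        words = kmeans.predict(desc)
        hist = np.bincount(words, minlength=K).astype(float)
        histograms.append(hist / hist.sum())

    Images can then be compared, or indexed in an inverted file, through these histograms rather than through the raw descriptors.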

    • The VLFeat website contains, next to an excellent SIFT library, a nice demo of this approach, classifying the Caltech 101 dataset.

    • Caltech itself offers Matlab/C++ software together with relevant publications.

    • The work by LEAR is also a good starting point.

  • 2021-01-01 01:42

    Check out the SURF features, which are part of OpenCV. The idea here is that you have an algorithm for finding "interest points" in two images. You also have an algorithm for computing a descriptor of an image patch around each interest point. Typically this descriptor captures the distribution of edge orientations in the patch. Then you try to find point correspondences, i.e. for each interest point in image A you try to find a corresponding interest point in image B. This is accomplished by comparing the descriptors and looking for the closest matches. Then, if you have a set of correspondences that are related by some geometric transformation, you have a detection.
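
    That final geometric-verification step can be expressed compactly with cv2.findHomography and RANSAC. A rough sketch, assuming keypoints and descriptor matches have already been computed; the inlier threshold is an arbitrary choice:

    import cv2
    import numpy as np

    def is_detected(kp1, kp2, matches, min_inliers=10):
        """Decide whether the matched points are geometrically consistent."""
        if len(matches) < 4:  # a homography needs at least 4 point pairs
            return False
        p1 = np.float32([kp1[m.queryIdx].pt for m in matches])
        p2 = np.float32([kp2[m.trainIdx].pt for m in matches])
        # RANSAC fits a homography while tolerating bad correspondences;
        # status flags which matches are inliers of the fitted model.
        H, status = cv2.findHomography(p1, p2, cv2.RANSAC, 5.0)
        return H is not None and int(status.sum()) >= min_inliers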

    Of course, this is a very high-level explanation. The devil is in the details, and for those you should read some papers. Start with "Distinctive Image Features from Scale-Invariant Keypoints" by David Lowe, and then read the papers on SURF.

    Also, consider moving this question to the Signal and Image Processing Stack Exchange.

  • 2021-01-01 01:48

    In case someone comes along in the future, here's a small sample doing this with OpenCV. It's based on the OpenCV sample, but (in my opinion) this is a bit clearer, so I'm including it as well.

    Tested with OpenCV 2.4.4. (The code is Python 2; note that cv2.SURF is not available in OpenCV 3+ without the opencv_contrib xfeatures2d module.)

    #!/usr/bin/env python
    
    '''
    Uses SURF to match two images.
      Finds common features between two images and draws them
    
    Based on the sample code from opencv:
      samples/python2/find_obj.py
    
    USAGE
      find_obj.py <image1> <image2>
    '''
    
    import sys
    
    import numpy
    import cv2
    
    
    ###############################################################################
    # Image Matching
    ###############################################################################
    
    def match_images(img1, img2, img1_features=None, img2_features=None):
        """Given two images, returns the matches"""
        # SURF with a Hessian threshold of 3200; higher values keep fewer,
        # stronger keypoints.
        detector = cv2.SURF(3200)
        # Brute-force matcher with L2 distance, suited to SURF's float descriptors.
        matcher = cv2.BFMatcher(cv2.NORM_L2)
    
        if img1_features is None:
            kp1, desc1 = detector.detectAndCompute(img1, None)
        else:
            kp1, desc1 = img1_features
    
        if img2_features is None:
            kp2, desc2 = detector.detectAndCompute(img2, None)
        else:
            kp2, desc2 = img2_features
    
        #print 'img1 - %d features, img2 - %d features' % (len(kp1), len(kp2))
    
        raw_matches = matcher.knnMatch(desc1, trainDescriptors=desc2, k=2)
        kp_pairs = filter_matches(kp1, kp2, raw_matches)
        return kp_pairs
    
    
    def filter_matches(kp1, kp2, matches, ratio=0.75):
        """Filters features that are common to both images"""
        # Lowe's ratio test: keep a match only if it is clearly better than
        # the second-best candidate for the same query keypoint.
        mkp1, mkp2 = [], []
        for m in matches:
            if len(m) == 2 and m[0].distance < m[1].distance * ratio:
                m = m[0]
                mkp1.append(kp1[m.queryIdx])
                mkp2.append(kp2[m.trainIdx])
        kp_pairs = zip(mkp1, mkp2)
        return kp_pairs
    
    
    ###############################################################################
    # Match Displaying
    ###############################################################################
    
    def draw_matches(window_name, kp_pairs, img1, img2):
        """Draws the matches"""
        mkp1, mkp2 = zip(*kp_pairs)
    
        H = None
        status = None
    
        if len(kp_pairs) >= 4:  # findHomography needs at least 4 point pairs
            p1 = numpy.float32([kp.pt for kp in mkp1])
            p2 = numpy.float32([kp.pt for kp in mkp2])
            H, status = cv2.findHomography(p1, p2, cv2.RANSAC, 5.0)
    
        if len(kp_pairs):
            explore_match(window_name, img1, img2, kp_pairs, status, H)
    
    
    def explore_match(win, img1, img2, kp_pairs, status=None, H=None):
        """Draws lines between the matched features"""
        h1, w1 = img1.shape[:2]
        h2, w2 = img2.shape[:2]
        # Stack the two grayscale images side by side on one canvas.
        vis = numpy.zeros((max(h1, h2), w1 + w2), numpy.uint8)
        vis[:h1, :w1] = img1
        vis[:h2, w1:w1 + w2] = img2
        vis = cv2.cvtColor(vis, cv2.COLOR_GRAY2BGR)
    
        if H is not None:
            # Project img1's corners through the homography to outline the
            # detected object in img2.
            corners = numpy.float32([[0, 0], [w1, 0], [w1, h1], [0, h1]])
            reshaped = cv2.perspectiveTransform(corners.reshape(1, -1, 2), H)
            reshaped = reshaped.reshape(-1, 2)
            corners = numpy.int32(reshaped + (w1, 0))
            cv2.polylines(vis, [corners], True, (255, 255, 255))
    
        if status is None:
            status = numpy.ones(len(kp_pairs), numpy.bool_)
        p1 = numpy.int32([kpp[0].pt for kpp in kp_pairs])
        p2 = numpy.int32([kpp[1].pt for kpp in kp_pairs]) + (w1, 0)
    
        green = (0, 255, 0)
        red = (0, 0, 255)
        for (x1, y1), (x2, y2), inlier in zip(p1, p2, status):
            if inlier:
                col = green
                cv2.circle(vis, (x1, y1), 2, col, -1)
                cv2.circle(vis, (x2, y2), 2, col, -1)
            else:
                col = red
                r = 2
                thickness = 3
                cv2.line(vis, (x1 - r, y1 - r), (x1 + r, y1 + r), col, thickness)
                cv2.line(vis, (x1 - r, y1 + r), (x1 + r, y1 - r), col, thickness)
                cv2.line(vis, (x2 - r, y2 - r), (x2 + r, y2 + r), col, thickness)
                cv2.line(vis, (x2 - r, y2 + r), (x2 + r, y2 - r), col, thickness)
        for (x1, y1), (x2, y2), inlier in zip(p1, p2, status):
            if inlier:
                cv2.line(vis, (x1, y1), (x2, y2), green)
    
        cv2.imshow(win, vis)
    
    ###############################################################################
    # Test Main
    ###############################################################################
    
    if __name__ == '__main__':
        if len(sys.argv) < 3:
            print "No filenames specified"
            print "USAGE: find_obj.py <image1> <image2>"
            sys.exit(1)
    
        fn1 = sys.argv[1]
        fn2 = sys.argv[2]
    
        img1 = cv2.imread(fn1, 0)
        img2 = cv2.imread(fn2, 0)
    
        if img1 is None:
            print 'Failed to load fn1:', fn1
            sys.exit(1)
    
        if img2 is None:
            print 'Failed to load fn2:', fn2
            sys.exit(1)
    
        kp_pairs = match_images(img1, img2)
    
        if kp_pairs:
            draw_matches('find_obj', kp_pairs, img1, img2)
        else:
            print "No matches found"
    
        cv2.waitKey()
        cv2.destroyAllWindows()
    