Image Processing: Algorithm Improvement for Real-Time FedEx Logo Detector

Asked by 半阙折子戏, 2021-02-02 12:16

I've been working on a project involving image processing for logo detection. Specifically, the goal is to develop an automated system for a real-time FedEx truck/logo detector.

2 Answers
  • 2021-02-02 12:58

    You can help the detector by preprocessing the image; then you won't need as many training images.

    First we reduce the barrel distortion.

    import cv2
    import numpy as np
    
    img = cv2.imread('fedex.jpg')
    
    # Add a border, as the undistorted image is going to be larger
    margin = 150
    img = cv2.copyMakeBorder(img, margin, margin, margin, margin,
                             cv2.BORDER_CONSTANT, value=0)
    
    width  = img.shape[1]
    height = img.shape[0]
    
    # Distortion coefficients: a small negative k1 corrects barrel distortion
    distCoeff = np.zeros((4, 1), np.float64)
    k1 = -4.5e-5
    k2 = 0.0
    p1 = 0.0
    p2 = 0.0
    distCoeff[0, 0] = k1
    distCoeff[1, 0] = k2
    distCoeff[2, 0] = p1
    distCoeff[3, 0] = p2
    
    # Approximate camera matrix with the principal point at the image center
    cam = np.eye(3, dtype=np.float32)
    cam[0, 2] = width / 2.0   # define center x
    cam[1, 2] = height / 2.0  # define center y
    cam[0, 0] = 12.           # define focal length x
    cam[1, 1] = 12.           # define focal length y
    
    dst = cv2.undistort(img, cam, distCoeff)
    

    Then we transform the image as if the camera were facing the FedEx truck head-on. That way, wherever along the curb the truck is parked, the FedEx logo will have almost the same size and orientation.

    # Use four points for homography estimation; coordinates taken from the undistorted image:
    # 1. top-left corner of F
    # 2. bottom-left corner of F
    # 3. top-right corner of E
    # 4. bottom-right corner of E
    pts_src = np.array([[1083, 235], [1069, 343], [1238, 301], [1201, 454]])
    pts_dst = np.array([[1069, 235], [1069, 320], [1201, 235], [1201, 320]])
    h, status = cv2.findHomography(pts_src, pts_dst)
    im_out = cv2.warpPerspective(dst, h, (dst.shape[1], dst.shape[0]))
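    
    Since the camera is fixed, the undistortion parameters and the homography only need to be estimated once and can then be applied to every incoming frame. A minimal sketch of that, reusing cam, distCoeff, margin, and h from above (the capture source and loop structure are assumptions, not part of the original answer):
    
    cap = cv2.VideoCapture(0)  # assumed capture source
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        # Same preprocessing as above, applied per frame
        frame = cv2.copyMakeBorder(frame, margin, margin, margin, margin,
                                   cv2.BORDER_CONSTANT, value=0)
        frame = cv2.undistort(frame, cam, distCoeff)
        frame = cv2.warpPerspective(frame, h, (frame.shape[1], frame.shape[0]))
        cv2.imshow('rectified', frame)
        if cv2.waitKey(1) & 0xFF == ord('q'):
            break
    cap.release()
    cv2.destroyAllWindows()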
    
  • 2021-02-02 13:04

    You might want to take a look at feature matching. The goal is to find features in two images (a template image and a noisy image) and match them. This would allow you to find the template (the logo) in the noisy image (the camera frame).

    A feature is, in essence, something that humans would find interesting in an image, such as a corner or an open space. I would recommend using the scale-invariant feature transform (SIFT) as a feature detection algorithm. The reason I suggest SIFT is that it is invariant to image translation, scaling, and rotation, partially invariant to illumination changes, and robust to local geometric distortion. This matches your specification.

    [Image: example of SIFT feature detection]

    I generated the above image using code modified from the OpenCV docs on SIFT feature detection:

    import numpy as np
    import cv2
    from matplotlib import pyplot as plt
    
    img = cv2.imread('main.jpg', 0)  # target image
    
    # Create the SIFT object, keeping the 700 strongest keypoints
    # (in OpenCV 4.4+, SIFT lives in the main module: cv2.SIFT_create)
    sift = cv2.xfeatures2d.SIFT_create(700)
    
    # Find keypoints and descriptors directly
    kp, des = sift.detectAndCompute(img, None)
    
    # Draw the keypoints onto the final image
    img2 = cv2.drawKeypoints(img, kp, None, (255, 0, 0), 4)
    
    # Show the image
    plt.imshow(img2)
    plt.show()
    

    You will notice when doing this that a large number of the features land on the FedEx logo (above).

    The next thing I did was try matching the features from the video feed to the features in the FedEx logo. I did this using the FLANN feature matcher. You could use many approaches (including brute-force matching), but because you are working with a video feed, FLANN is probably your best option. The code below is inspired by the OpenCV docs on feature matching:

    import numpy as np
    import cv2
    from matplotlib import pyplot as plt
    
    logo = cv2.imread('logo.jpg', 0)  # query image
    img = cv2.imread('main2.jpg', 0)  # target image
    
    # Create the SIFT object
    sift = cv2.xfeatures2d.SIFT_create(700)
    
    # Find keypoints and descriptors directly
    kp1, des1 = sift.detectAndCompute(img, None)
    kp2, des2 = sift.detectAndCompute(logo, None)
    
    # FLANN parameters
    FLANN_INDEX_KDTREE = 1
    index_params = dict(algorithm=FLANN_INDEX_KDTREE, trees=5)
    search_params = dict(checks=50)   # or pass an empty dictionary
    flann = cv2.FlannBasedMatcher(index_params, search_params)
    matches = flann.knnMatch(des1, des2, k=2)
    
    # Need to draw only good matches, so create a mask
    matchesMask = [[0, 0] for i in range(len(matches))]
    
    # Ratio test as per Lowe's paper
    for i, (m, n) in enumerate(matches):
        if m.distance < 0.7 * n.distance:
            matchesMask[i] = [1, 0]
    
    # Drawing parameters: green match lines, red keypoints
    draw_params = dict(matchColor=(0, 255, 0),
                       singlePointColor=(255, 0, 0),
                       matchesMask=matchesMask,
                       flags=0)
    
    # Display the matches
    img3 = cv2.drawMatchesKnn(img, kp1, logo, kp2, matches, None, **draw_params)
    plt.imshow(img3)
    plt.show()
    

    Using this, I was able to get the feature matches seen below. You will notice that there are outliers, but the majority of the features match:

    The final step would then be to draw a bounding box around the matched logo; a sketch of that step follows. There is a similar Stack Overflow question that does this with the ORB detector, and the OpenCV docs show another way to get a bounding box.
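    
    As a rough sketch of that step (this is not from the linked answers; it follows the standard OpenCV homography recipe and reuses kp1, kp2, logo, img, and matches from the snippet above, with the 10-match threshold being an arbitrary assumption): collect the matches that pass the ratio test, estimate a homography with RANSAC, which also rejects the remaining outliers, and project the corners of the logo template into the camera image.
    
    # Keep only the matches that pass Lowe's ratio test
    good = [m for m, n in matches if m.distance < 0.7 * n.distance]
    
    if len(good) >= 10:  # assumed minimum number of matches
        # knnMatch(des1, des2) means queryIdx indexes the target image (kp1)
        # and trainIdx indexes the logo template (kp2)
        src_pts = np.float32([kp2[m.trainIdx].pt for m in good]).reshape(-1, 1, 2)
        dst_pts = np.float32([kp1[m.queryIdx].pt for m in good]).reshape(-1, 1, 2)
    
        # RANSAC rejects outlier matches while estimating the homography
        M, mask = cv2.findHomography(src_pts, dst_pts, cv2.RANSAC, 5.0)
    
        # Project the template's corners into the camera image
        h, w = logo.shape
        corners = np.float32([[0, 0], [0, h - 1],
                              [w - 1, h - 1], [w - 1, 0]]).reshape(-1, 1, 2)
        box = cv2.perspectiveTransform(corners, M)
    
        # Draw the bounding box on the camera image
        img_box = cv2.polylines(img.copy(), [np.int32(box)], True, 255, 3, cv2.LINE_AA)
        plt.imshow(img_box, cmap='gray')
        plt.show()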

    I hope this helps!
