Image Processing: Algorithm Improvement for 'Coca-Cola Can' Recognition

后悔当初 2020-11-22 09:26

One of the most interesting projects I've worked on in the past couple of years was a project about image processing. The goal was to develop a system to be able to recogni

23 Answers
  • 2020-11-22 09:59

    I would detect red rectangles: convert RGB to HSV, filter on red to get a binary image, then apply a morphological close (dilate, then erode; known as imclose in MATLAB).
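
    A minimal sketch of that filter-and-close step, using only the Python standard library so it stays self-contained. The thresholds (`min_sat`, `min_val`, `hue_tol`) and the 3x3 neighbourhood are assumptions for illustration, not values from this answer; a real pipeline would more likely use an image library for this.

```python
import colorsys

def red_mask(image, min_sat=0.5, min_val=0.3, hue_tol=0.05):
    """Return a binary mask (rows of 0/1) marking red pixels.

    image is a list of rows of (r, g, b) tuples with components in 0..255.
    Red sits at both ends of the hue circle, so hue is tested near 0 and near 1.
    """
    mask = []
    for row in image:
        out = []
        for r, g, b in row:
            h, s, v = colorsys.rgb_to_hsv(r / 255, g / 255, b / 255)
            is_red = (h <= hue_tol or h >= 1 - hue_tol) and s >= min_sat and v >= min_val
            out.append(1 if is_red else 0)
        mask.append(out)
    return mask

def _filter3x3(mask, combine, border):
    """Apply a 3x3 min or max filter; pixels outside the image count as `border`."""
    height, width = len(mask), len(mask[0])

    def at(y, x):
        return mask[y][x] if 0 <= y < height and 0 <= x < width else border

    return [[combine(at(y + dy, x + dx) for dy in (-1, 0, 1) for dx in (-1, 0, 1))
             for x in range(width)] for y in range(height)]

def dilate(mask):
    return _filter3x3(mask, max, 0)

def erode(mask):
    return _filter3x3(mask, min, 1)

def close_mask(mask):
    """Morphological close: dilate first, then erode, which fills small holes."""
    return erode(dilate(mask))
```

    The close step is what merges a logo broken up by white lettering into one solid red blob before looking for rectangles.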

    Then look through rectangles from largest to smallest. Rectangles that have smaller rectangles in a known position/scale can both be removed (assuming bottle proportions are constant, the smaller rectangle would be a bottle cap).
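
    One way to sketch that pass, assuming rectangles are `(x, y, width, height)` tuples with y growing downward. The geometry in `looks_like_cap` (cap narrower than the label, roughly centred, a limited distance above it) and all of its ratios are hypothetical placeholders that would have to be measured on real bottles.

```python
def looks_like_cap(label, cand, width_ratio=(0.2, 0.6), max_gap_ratio=1.5, center_tol=0.25):
    """Guess whether `cand` sits where a bottle cap would be relative to `label`."""
    lx, ly, lw, lh = label
    cx, cy, cw, ch = cand
    # The cap should be noticeably narrower than the label.
    if not (width_ratio[0] * lw <= cw <= width_ratio[1] * lw):
        return False
    # The cap should be roughly centred above the label.
    if abs((cx + cw / 2) - (lx + lw / 2)) > center_tol * lw:
        return False
    # The cap should lie above the label, within a bounded gap.
    gap = ly - (cy + ch)
    return 0 <= gap <= max_gap_ratio * lh

def remove_bottles(rects):
    """Walk rectangles from largest to smallest and drop label/cap pairs."""
    ordered = sorted(rects, key=lambda r: r[2] * r[3], reverse=True)
    removed = set()
    for i, big in enumerate(ordered):
        if i in removed:
            continue
        for j in range(i + 1, len(ordered)):
            if j not in removed and looks_like_cap(big, ordered[j]):
                removed.update((i, j))  # both rectangles belong to a bottle
                break
    return [r for k, r in enumerate(ordered) if k not in removed]
```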

    This would leave you with red rectangles; you'll then need to somehow detect the logo to tell whether each one is just a red rectangle or a Coke can. Like OCR, but with a known logo?

  • 2020-11-22 09:59

    I like the challenge and wanted to give an answer that, I think, solves the issue.

    1. Extract features (keypoints and descriptors such as SIFT or SURF) of the logo
    2. Match the points with a model image of the logo (using a matcher such as brute force)
    3. Estimate the pose of the rigid body (PnP problem - SolvePnP)
    4. Estimate the cap position relative to the rigid body
    5. Do back-projection and calculate the image pixel position (ROI) of the bottle cap (I assume you have the intrinsic parameters of the camera)
    6. Check with some method whether the cap is there or not. If it is, then this is the bottle
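
    Steps 4 and 5 can be sketched with a plain pinhole camera model. This assumes the pose from step 3 is already available as a 3x3 rotation matrix and a translation vector, that the cap position is known in the logo's coordinate frame, and that lens distortion can be ignored. The function names and the square ROI are hypothetical, chosen only for illustration.

```python
def project_point(point_obj, rotation, translation, fx, fy, cx, cy):
    """Project a 3D point given in the object frame to pixel coordinates.

    rotation is a 3x3 matrix (list of rows) and translation a 3-vector, both
    mapping object coordinates into the camera frame. fx, fy, cx, cy are the
    camera intrinsics. Lens distortion is ignored in this sketch.
    """
    x, y, z = (
        sum(rotation[i][k] * point_obj[k] for k in range(3)) + translation[i]
        for i in range(3)
    )
    if z <= 0:
        raise ValueError("point is behind the camera")
    return (fx * x / z + cx, fy * y / z + cy)

def square_roi(center_px, half_size):
    """Return an (x, y, width, height) box centred on a projected pixel."""
    u, v = center_px
    return (int(round(u - half_size)), int(round(v - half_size)),
            2 * half_size, 2 * half_size)
```

    For example, with an identity rotation, the object 1 unit in front of the camera, and a cap 0.1 units above the logo origin (image y grows downward), the cap projects straight above the principal point.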

    Detection of the cap is another issue. It can be either complicated or simple. If I were you, I would simply check the color histogram in the ROI for a simple decision.
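
    A rough sketch of such a histogram check, assuming the ROI has already been cut out as a flat list of `(r, g, b)` pixels and that the cap is red. The bin count, the bins treated as "cap coloured", and the decision threshold are all assumptions.

```python
import colorsys

def hue_histogram(pixels, bins=12, min_sat=0.3, min_val=0.2):
    """Count pixels per hue bin, skipping pixels too grey or too dark to have a reliable hue."""
    hist = [0] * bins
    for r, g, b in pixels:
        h, s, v = colorsys.rgb_to_hsv(r / 255, g / 255, b / 255)
        if s < min_sat or v < min_val:
            continue
        hist[min(int(h * bins), bins - 1)] += 1
    return hist

def cap_present(pixels, cap_bins=(0, 11), min_fraction=0.4):
    """Decide that a cap is present if enough ROI pixels fall into the cap's hue bins.

    With 12 bins, bins 0 and 11 cover the two ends of the hue circle, i.e. red.
    """
    if not pixels:
        return False
    hist = hue_histogram(pixels)
    return sum(hist[b] for b in cap_bins) / len(pixels) >= min_fraction
```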

    Please give feedback if I am wrong. Thanks.

  • 2020-11-22 09:59

    Maybe too many years late, but nevertheless a theory to try.

    The ratio of the red logo region's bounding rectangle to the overall dimensions of the object differs between a can and a bottle. For a can it should be about 1:1, whereas for a bottle (with or without the cap) it will be different. This should make it easy to distinguish between the two.
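
    As a small sketch of that test, assuming both bounding boxes are already available as `(x, y, width, height)` tuples. Comparing heights rather than areas, and the 0.8 cutoff, are assumptions made here for illustration.

```python
def logo_height_ratio(logo_box, object_box):
    """Return logo height divided by overall object height."""
    if object_box[3] <= 0:
        raise ValueError("object box must have positive height")
    return logo_box[3] / object_box[3]

def is_can(logo_box, object_box, min_ratio=0.8):
    """Treat the object as a can when the red region spans nearly its full height."""
    return logo_height_ratio(logo_box, object_box) >= min_ratio
```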

    Update: The horizontal curvature of the logo region will differ between the can and the bottle due to their difference in size. This could be especially useful if your robot needs to pick up the can or bottle and you choose the grip accordingly.

  • 2020-11-22 10:01

    Looking at shape

    Take a gander at the shape of the red portion of the can/bottle. Notice how the can tapers off slightly at the very top whereas the bottle label is straight. You can distinguish between these two by comparing the width of the red portion across the length of it.
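
    A sketch of that width comparison, assuming the red portion is already available as a binary mask (rows of 0/1, top row first). The number of rows treated as "the top" and the required shrink are hypothetical parameters.

```python
def row_widths(mask):
    """Width of the red region in each row that contains any red."""
    widths = []
    for row in mask:
        cols = [x for x, value in enumerate(row) if value]
        if cols:
            widths.append(cols[-1] - cols[0] + 1)
    return widths

def tapers_at_top(mask, top_rows=2, min_shrink=0.1):
    """True if the top rows are clearly narrower than the median of the rest."""
    widths = row_widths(mask)
    if len(widths) <= top_rows:
        return False
    top = sum(widths[:top_rows]) / top_rows
    body = sorted(widths[top_rows:])
    median = body[len(body) // 2]
    return top <= (1 - min_shrink) * median
```

    A tapering mask would then suggest a can, and a constant-width one a bottle label.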

    Looking at highlights

    One way to distinguish between bottles and cans is the material. A bottle is made of plastic whereas a can is made of aluminum metal. In sufficiently well-lit situations, looking at the specularity would be one way of telling a bottle label from a can label.
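
    One crude way to put a number on specularity is to measure how much of the label region is covered by highlights, meaning pixels that are very bright and nearly colourless. The sketch below assumes the region is a flat list of `(r, g, b)` pixels. The thresholds are guesses, and which fraction indicates which material would have to be calibrated on real images.

```python
def highlight_fraction(pixels, min_val=0.9, max_sat=0.2):
    """Fraction of pixels that look like specular highlights (bright, low saturation)."""
    if not pixels:
        return 0.0
    count = 0
    for r, g, b in pixels:
        brightest = max(r, g, b) / 255
        darkest = min(r, g, b) / 255
        saturation = 0.0 if brightest == 0 else (brightest - darkest) / brightest
        if brightest >= min_val and saturation <= max_sat:
            count += 1
    return count / len(pixels)
```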

    As far as I can tell, that is how a human would tell the difference between the two types of labels. If the lighting conditions are poor, there is bound to be some uncertainty in distinguishing the two anyway. In that case, you would have to detect the presence of the transparent/translucent bottle itself.

  • 2020-11-22 10:02

    Deep Learning

    Gather at least a few hundred images containing cola cans and annotate the bounding boxes around the cans as the positive class. Include cola bottles, other cola products, and random objects, and label them as negative classes.

    Unless you collect a very large dataset, use the trick of taking deep learning features from a pretrained network for your small dataset, ideally combining a Support Vector Machine (SVM) with the deep neural net.

    Once you feed the images to a previously trained deep learning model (e.g. GoogLeNet), instead of using the network's final decision layer to classify, use the outputs of the previous layer(s) as features to train your own classifier.
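
    The classifier half of that idea can be sketched without any libraries. The code below trains a linear SVM by sub-gradient descent on the hinge loss. It assumes the feature vectors have already been extracted from the pretrained network (not shown here) and that labels are +1 for cans and -1 for everything else. The hyperparameters are arbitrary, and a real project would more likely use an existing SVM implementation.

```python
def train_linear_svm(features, labels, epochs=200, lr=0.1, reg=0.01):
    """Fit weights w and bias b minimising reg/2 * |w|^2 + hinge loss.

    features: list of equal-length float vectors (stand-ins for network activations).
    labels:   list of +1 / -1 values.
    """
    w = [0.0] * len(features[0])
    b = 0.0
    for _ in range(epochs):
        for x, y in zip(features, labels):
            margin = y * (sum(wi * xi for wi, xi in zip(w, x)) + b)
            if margin < 1:
                # Sample violates the margin: regulariser plus hinge sub-gradient.
                w = [wi - lr * (reg * wi - y * xi) for wi, xi in zip(w, x)]
                b += lr * y
            else:
                # Sample is safely classified: only the regulariser acts.
                w = [wi - lr * reg * wi for wi in w]
    return w, b

def predict(w, b, x):
    """Return +1 (can) or -1 (not a can) for one feature vector."""
    return 1 if sum(wi * xi for wi, xi in zip(w, x)) + b >= 0 else -1
```

    The appeal of this split is that only the small linear model is trained on your few hundred images, while the feature extractor stays fixed.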

    OpenCV and Google Net: http://docs.opencv.org/trunk/d5/de7/tutorial_dnn_googlenet.html

    OpenCV and SVM: http://docs.opencv.org/2.4/doc/tutorials/ml/introduction_to_svm/introduction_to_svm.html

  • 2020-11-22 10:03

    Please take a look at Zdenek Kalal's Predator tracker. It requires some training, but it can actively learn how the tracked object looks at different orientations and scales and does it in realtime!

    The source code is available on his site. It's in MATLAB, but perhaps there is a Java implementation already done by a community member. I have successfully re-implemented the tracker part of TLD in C#. If I remember correctly, TLD uses Ferns as the keypoint detector. I use either SURF or SIFT instead (already suggested by @stacker) to reacquire the object if the tracker loses it. The tracker's feedback makes it easy to build up, over time, a dynamic list of SIFT/SURF templates that allow the object to be reacquired with very high precision.

    If you're interested in my C# implementation of the tracker, feel free to ask.
