The art of image processing is (in my 10+ years of experience) just that: an art. There is no single answer, and there is always more than one way to do it. And it will definitely fail in some cases.
In my experience of working on automatically detecting features in medical images, it takes a long time to build a reliable algorithm, but in hindsight the best result is usually obtained with a relatively simple algorithm. It just takes a lot of time to arrive at that simple algorithm.
To get to this, the general approach is always the same:
- To get started, build up a large database of test images (at least 100). This defines the 'normal' images which should work. By collecting the images you already start thinking about the problem.
- Annotate the images to build a kind of 'ground truth'. In this case, the 'ground truth' should contain the 4 corners of the card, since these are the interesting points.
- Create an application which runs an algorithm over these images and compares the result with the ground truth. In this case, 'comparing with the ground truth' would mean taking the mean distance of the 4 found corner points to the ground-truth corner points (a rough sketch of such a harness follows this list).
- Output a tab-delimited file which you call .xls, so it can be opened (on Windows) in Excel by double-clicking. Good for getting a quick overview of the cases. Look at the worst cases first, then open these cases manually to try to understand why they do not work.
- Now you are ready to change the algorithm. Change something, and re-run. Compare new Excel sheet to old Excel sheet. Now you start realizing the trade-offs you have to make.
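A minimal sketch of what such a test harness could look like (in Python; `detect_corners`, `cases` and the file names are placeholders for your own algorithm and annotated database, not anything prescribed above):

    import csv

    import numpy as np


    def mean_corner_error(found, truth):
        # Mean Euclidean distance between the 4 found corners and the 4
        # ground-truth corners; both are (4, 2) arrays in the same corner order.
        found = np.asarray(found, dtype=float)
        truth = np.asarray(truth, dtype=float)
        return float(np.mean(np.linalg.norm(found - truth, axis=1)))


    def run_benchmark(cases, detect_corners, out_path="results.xls"):
        # cases: list of (name, image, truth_corners); detect_corners: the
        # algorithm under test, returning 4 (x, y) points or None.
        rows = []
        for name, image, truth in cases:
            found = detect_corners(image)
            error = mean_corner_error(found, truth) if found is not None else float("inf")
            rows.append((name, error))
        rows.sort(key=lambda row: row[1], reverse=True)  # worst cases first
        # Tab-delimited file named .xls so Excel opens it with a double click.
        with open(out_path, "w", newline="") as f:
            writer = csv.writer(f, delimiter="\t")
            writer.writerow(["image", "mean_corner_error_px"])
            writer.writerows(rows)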
Having said that, I think you need to answer these questions while tuning the algorithm:
- Do you allow slightly folded cards? If so, there are no completely straight lines, so concentrate more on corners than on lines / edges.
- Do you allow gradual differences in lighting? If so, a local contrast-stretch filter might help (see the sketch after this list).
- Do you allow the same color for the card as the background? If so, you have to concentrate on the contents of the card instead of the border of the card.
- Do you allow non-perfect lenses? If so, to what extent?
- Do you allow rotated cards? If so, to what extent?
- Should the background be uniform in color and/or texture?
- How small should the smallest detectable card be relative to the image size? If you assume that at least 80% of the width or height should be covered, you get robustness back.
- If more than one card is visible in the image, should the algorithm be robust and only pick one, or is any output ok?
- If no card is visible, should it detect this case? Building in detection of this case will make it more user friendly ('no card found'), but also less robust.
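As a rough sketch of a local contrast-stretch step, here is one possible implementation in Python using OpenCV's CLAHE (adaptive histogram equalization) as the local operator; the clip limit and tile size are just illustrative defaults to tune on your test database:

    import cv2


    def local_contrast_stretch(image_bgr, clip_limit=2.0, tile_size=8):
        # Stretch contrast locally so gradual lighting differences across the
        # image matter less; CLAHE serves here as the local contrast operator.
        gray = cv2.cvtColor(image_bgr, cv2.COLOR_BGR2GRAY)
        clahe = cv2.createCLAHE(clipLimit=clip_limit,
                                tileGridSize=(tile_size, tile_size))
        return clahe.apply(gray)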
These answers define the requirements for, and the assumptions about, the images to acquire. Assumptions you can rely on are very powerful: if you choose the right ones, they make the algorithm fast, robust and simple. Also make these requirements and assumptions part of the test database.
So what would I choose? Based on the three images you provided I would start with something like:
- Assume the cards are filling the image from 50% to 100%.
- Assume the cards are rotated at most 10 degrees or so.
- Assume the corners are well visible.
- Assume the aspect ratio (height divided by width) of the cards to be between 1/3 and 3.
- Assume no card-like objects in the background.
The algorithm then would look like:
- Detect a specific corner in each quadrant of the image with a corner filter: in the upper-left quadrant of the image, look for the upper-left corner of the card, and so on. Look for example at http://www.ee.surrey.ac.uk/CVSSP/demos/corners/results3.html , or use an OpenCV function for it like `cornerHarris` (a sketch follows this algorithm outline).
- To be more robust, calculate more than one corner per quadrant.
Try to build parallelograms by combining one corner candidate from each quadrant. Create a fitness function which gives a higher score to parallelograms that:
- have internal angles close to 90 degrees
- are large
- optionally, have corners that match in lighting or some other feature.
This fitness function gives a lot of tuning possibilities later on.
Return the parallelogram with the highest score.
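A minimal sketch of these two steps in Python with OpenCV (the Harris parameters, the number of candidates per quadrant, and the fitness weight are illustrative starting values, not tuned on any data):

    import itertools

    import cv2
    import numpy as np


    def corner_candidates_per_quadrant(gray, per_quadrant=5):
        # Harris response over the whole image; keep the strongest responses in
        # each quadrant (upper-left, upper-right, lower-left, lower-right).
        response = cv2.cornerHarris(np.float32(gray), blockSize=5, ksize=3, k=0.04)
        h, w = gray.shape
        quadrants = [(slice(0, h // 2), slice(0, w // 2)),   # upper left
                     (slice(0, h // 2), slice(w // 2, w)),   # upper right
                     (slice(h // 2, h), slice(0, w // 2)),   # lower left
                     (slice(h // 2, h), slice(w // 2, w))]   # lower right
        candidates = []
        for ys, xs in quadrants:
            region = response[ys, xs]
            flat = np.argsort(region, axis=None)[-per_quadrant:]
            ry, rx = np.unravel_index(flat, region.shape)
            candidates.append([(xs.start + x, ys.start + y) for x, y in zip(rx, ry)])
        return candidates  # [UL, UR, LL, LR] lists of (x, y) points


    def fitness(quad):
        # Higher score for quadrilaterals with internal angles near 90 degrees
        # and a large area; the weight is an arbitrary starting value to tune.
        pts = np.asarray(quad, dtype=float)
        angle_penalty = 0.0
        for i in range(4):
            a, b, c = pts[i - 1], pts[i], pts[(i + 1) % 4]
            v1, v2 = a - b, c - b
            cos_angle = np.dot(v1, v2) / (np.linalg.norm(v1) * np.linalg.norm(v2) + 1e-9)
            angle = np.degrees(np.arccos(np.clip(cos_angle, -1.0, 1.0)))
            angle_penalty += abs(angle - 90.0)
        x, y = pts[:, 0], pts[:, 1]
        area = 0.5 * abs(np.dot(x, np.roll(y, -1)) - np.dot(y, np.roll(x, -1)))
        return area - 50.0 * angle_penalty


    def best_parallelogram(gray):
        ul, ur, ll, lr = corner_candidates_per_quadrant(gray)
        # Order UL -> UR -> LR -> LL so the candidate polygon is traversed in order.
        return max(itertools.product(ul, ur, lr, ll), key=fitness)

In a fuller version you would also reject combinations that violate the assumptions above (aspect ratio, minimum size, rotation) before scoring them.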
So why use corner detection instead of a Hough transform for line detection? In my opinion the Hough transform is (besides being slow) quite sensitive to patterns in the background (which is what you see in your first image -- it detects a stronger line in the background than on the card), and it cannot handle slightly curved lines that well, unless you use a larger bin size, which will worsen the detection.
Good luck!