How to classify blurry numbers with OpenCV


无人及你

I would like to capture the number from this kind of picture.

I tried multi-scale matching from the following link.

http://www.pyimagesearch.com/2015/

2 Answers

隐瞒了意图╮

2021-01-31 05:18
Classifying Digits

You clarified in comments that you've already isolated the number part of the image pre-detection, so I'll start under that assumption.

Perhaps you can approximate the perspective effects and "blurriness" of the number by treating it as a handwritten number. In this case, there is a famous dataset of handwritten numerals for classification training called MNIST.

Yann LeCun has enumerated the state of the art on this dataset on his MNIST handwritten digit database page.

At the far end of the spectrum, convolutional neural networks yield outrageously low error rates (fractions of 1% error). For a simpler solution, k-nearest neighbours with deskewing, noise removal, blurring, and a 2-pixel shift yields about 1% error and is significantly faster to implement. OpenCV's Python bindings include an implementation. Neural networks and support vector machines with deskewing also perform impressively.
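As a rough illustration of the simpler k-nearest-neighbours route, here is a minimal sketch. It assumes scikit-learn's `KNeighborsClassifier` and its small built-in digits set rather than the OpenCV implementation or the full MNIST data, and it skips the deskewing, noise removal and pixel-shift steps, so its error rate should not be expected to match the figures quoted above.

```python
from sklearn import datasets
from sklearn.neighbors import KNeighborsClassifier

digits = datasets.load_digits()
n_samples = len(digits.images)
# Flatten each 8x8 image into a 64-element feature vector
data = digits.images.reshape((n_samples, -1))
half = n_samples // 2

# k=3 is an arbitrary starting point, not a tuned value
knn = KNeighborsClassifier(n_neighbors=3)
knn.fit(data[:half], digits.target[:half])

# Fraction of correctly labelled digits in the unseen second half
knn_accuracy = knn.score(data[half:], digits.target[half:])
print(f"k-NN accuracy on the held-out half: {knn_accuracy:.3f}")
```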

Note that convolutional networks don't have you pick your own features, so the important color-differential information here might just be used for narrowing the region-of-interest. Other approaches, where you define your feature space, might incorporate the known color difference more precisely.

Python supports a lot of machine learning techniques in the terrific package sklearn, and there are examples of sklearn applied to MNIST. If you're looking for a tutorial-style explanation of machine learning in Python, sklearn's own tutorial is very thorough.

The sample digit images shown in the sklearn example are the kinds of items you're trying to classify if you learn using this approach. To emphasize how easy it is to start training some of these machine learning-based classifiers, here is an abridged section from the example code in the linked sklearn package:
```
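from sklearn import datasets, svm

digits = datasets.load_digits()  # built in to sklearn!
n_samples = len(digits.images)
# Flatten each 8x8 image into a 64-element feature vector
data = digits.images.reshape((n_samples, -1))

# Create a classifier: a support vector classifier
classifier = svm.SVC(gamma=0.001)

# We learn the digits on the first half of the digits
# (integer division, so the slice index is an int in Python 3)
classifier.fit(data[:n_samples // 2], digits.target[:n_samples // 2])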
```
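The snippet stops after training. A minimal sketch of how the held-out second half could be scored is below; the split and `gamma` follow the snippet, while the choice of `accuracy_score` and `confusion_matrix` as metrics is my own assumption about what is useful to look at.

```python
from sklearn import datasets, metrics, svm

digits = datasets.load_digits()
n_samples = len(digits.images)
data = digits.images.reshape((n_samples, -1))
half = n_samples // 2

classifier = svm.SVC(gamma=0.001)
classifier.fit(data[:half], digits.target[:half])

# Predict the digits in the second half, which the classifier has not seen
expected = digits.target[half:]
predicted = classifier.predict(data[half:])

svc_accuracy = metrics.accuracy_score(expected, predicted)
confusion = metrics.confusion_matrix(expected, predicted)
print(f"accuracy: {svc_accuracy:.3f}")
print(confusion)  # rows: true digit, columns: predicted digit
```

The confusion matrix shows which digits get mistaken for which, which is worth checking before trusting the classifier on blurry input.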
If you're wedded to OpenCV (possibly because you want to port to a real-time system in the future), OpenCV 3 for Python has a tutorial on this exact topic too. Their demo uses k-nearest neighbours (listed on the LeCun page), but they also have SVMs and many of the other tools in sklearn. Their OCR page using SVMs uses deskewing, which might be useful with the perspective effect in your problem.

UPDATE: I used the out-of-the-box sklearn approach outlined above on your image, heavily cropped, and it correctly classified it. A lot more testing would be required to see if this is robust in practice.

The input was a tiny 8x8 crop of the image you embedded in your question. The sklearn digits dataset consists of 8x8 images (the full MNIST images are 28x28), which is why it trains in less than a second with default arguments in sklearn.

I converted it to the correct format by scaling it up to the range of the digits dataset (0 to 16) using:
```
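import matplotlib.pyplot as plt  # scipy.misc.imread was removed from SciPy
number = plt.imread("cropped_image.png")  # 8x8 colour PNG, floats in [0, 1]
# First channel only, scaled toward the 0-16 range of the digits dataset;
# predict() expects a 2-D array with one row per sample
datum = (number[:, :, 0] * 15).astype(int).reshape((1, 64))
classifier.predict(datum)  # returned 8 for my crop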
```
I didn't change anything else from the example; here, I'm only using the first channel for classification, with no smart feature computation. A scale factor of 15 looked about right to me; you'll need to tune it to get within the target range or (ideally) provide your own training and testing set.
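Instead of hand-tuning the scale factor, one option is to stretch each crop to the full 0 to 16 range of the sklearn digits data. The helper below is a hypothetical sketch (`to_digits_range` is my own name, not part of any library), and it assumes a bright digit on a dark background, as in the training set.

```python
import numpy as np

def to_digits_range(gray):
    """Rescale an 8x8 grayscale crop to the 0-16 range of sklearn's digits set.

    `gray` may be uint8 (0-255) or float; only relative intensities matter.
    """
    gray = np.asarray(gray, dtype=float)
    if gray.shape != (8, 8):
        raise ValueError("expected an 8x8 crop; resize the image first")
    lo, hi = gray.min(), gray.max()
    if hi == lo:
        return np.zeros((1, 64))  # flat image: no contrast to stretch
    scaled = (gray - lo) / (hi - lo) * 16.0
    return scaled.reshape(1, -1)  # one row = one sample for predict()

# Stand-in crop with a smooth gradient from black to white
crop = np.linspace(0, 255, 64).reshape(8, 8).astype(np.uint8)
datum = to_digits_range(crop)
```

If your digit is darker than its background, invert the crop first (for example `255 - crop` for 8-bit images) so that it matches the polarity of the training data.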

Object Detection

If you haven't isolated the number in the image, you'll need an object detector. The literature on this problem is gigantic and I won't start down that rabbit hole (searching for Viola and Jones is a reasonable starting point). This blog covers the fundamentals of a "sliding window" detector in Python. Its author, Adrian Rosebrock, appears to be a contributor on SO as well, and the blog has good tutorial-style examples of OpenCV and Python-based object detectors (I hadn't realized that you actually linked to that blog in your question).

In short, classify windows across the image and pick the window of highest confidence. Narrowing down the search space with a region of interest should improve both speed and accuracy.
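The idea of classifying every window and keeping the most confident one can be sketched like this. The helper names are hypothetical, and the scoring function is a toy stand-in (mean brightness); in practice it would be your trained classifier's confidence for a patch.

```python
import numpy as np

def sliding_windows(image, window, step):
    """Yield (x, y, patch) for every window position that fits in the image."""
    win_h, win_w = window
    for y in range(0, image.shape[0] - win_h + 1, step):
        for x in range(0, image.shape[1] - win_w + 1, step):
            yield x, y, image[y:y + win_h, x:x + win_w]

def best_window(image, window, step, score_fn):
    """Return (score, x, y) of the window with the highest confidence."""
    return max((score_fn(patch), x, y)
               for x, y, patch in sliding_windows(image, window, step))

# Toy image: a bright 8x8 square on a dark background
image = np.zeros((32, 48))
image[8:16, 20:28] = 1.0

# Toy confidence: mean brightness of the patch
score, x, y = best_window(image, (8, 8), 4, lambda p: float(p.mean()))
print(score, x, y)
```

A smaller step finds tighter fits but costs more classifier calls, which is why restricting the search to a region of interest pays off.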
悲&欢浪女

2021-01-31 05:36
You have a couple of things you can use to your advantage:
- The number is within the black rectangular bezel and is a single colour.
- The number appears to be on a segmented LCD-type display; if so, there are only a finite number of segments, each either off or on.
So I suggest you:
- Calibrate your camera and preprocess the image to remove lens distortion.
- Rectify the display rectangle:
  - Detect the display rectangle using either the intersection of Hough lines, or edge detection followed by contour detection, and then pick the biggest, squarest contour.
  - Use getPerspectiveTransform to get the transform between image coordinates and an ideal rectangle, then transform the input image using warpPerspective.
- Split the image into R, G and B channels and compute r - avg(g, b). This is somewhat lighting dependent, but it should make the digits stand out from the rest of the image.
- Then either try pattern matching on the result, re-segment the image and work out which display segments are lit, or run it through an OCR package.