Question
I am using a HOG feature detector based on SVM classification. I can successfully extract the license plate, but the extracted plate image contains some unnecessary pixels/lines in addition to the license number. My image processing pipeline is as follows:
- Applying HOG detector on the grayscale image
- Cropping detected region
- Re-sizing the cropped image
- Applying an adaptive threshold to highlight the plate numbers and filter out the background, using the following OpenCV code:
  cvAdaptiveThreshold(cropped_plate, thresholded_plate, 255, CV_ADAPTIVE_THRESH_GAUSSIAN_C, CV_THRESH_BINARY_INV, 11, 9);
- De-skewing the plate image
Because of this unnecessary information, Tesseract OCR fails to recognize the numbers correctly. The extracted number plate images look like the following.
How can I filter these unnecessary pixels/lines out of the images? Any help will be appreciated.
Answer 1:
You want to remove all non-text objects in the image. To do that, I suggest sorting the blobs by area of their bounding box (maxy - miny)*(maxx - minx). Do some statistical analysis; you know you are looking for objects of a similar size. Once you identify the approximate size of a character, make a larger bounding box that estimates the whole text. Keep the small blobs inside it, so for your picture, the dash sign will be preserved.
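The idea above can be sketched roughly as follows. This is a minimal, dependency-free illustration, not a definitive implementation: the image is assumed to be a list of rows of 0/1 values (1 = foreground after thresholding), and the names find_blobs, keep_text_blobs, make_demo_plate and the tol parameter are hypothetical. It also assumes the characters are the most common blob size, so the median bounding-box area approximates one character; on a noisy plate that assumption may not hold. With OpenCV you would normally get the bounding boxes from findContours and boundingRect instead of the hand-written flood fill.

```python
from collections import deque
from statistics import median


def find_blobs(img):
    """Return bounding boxes (miny, minx, maxy, maxx) of 4-connected blobs."""
    h, w = len(img), len(img[0])
    seen = [[False] * w for _ in range(h)]
    boxes = []
    for y in range(h):
        for x in range(w):
            if not img[y][x] or seen[y][x]:
                continue
            # Breadth-first flood fill over one blob, tracking its extent.
            seen[y][x] = True
            queue = deque([(y, x)])
            miny = maxy = y
            minx = maxx = x
            while queue:
                cy, cx = queue.popleft()
                miny, maxy = min(miny, cy), max(maxy, cy)
                minx, maxx = min(minx, cx), max(maxx, cx)
                for ny, nx in ((cy + 1, cx), (cy - 1, cx), (cy, cx + 1), (cy, cx - 1)):
                    if 0 <= ny < h and 0 <= nx < w and img[ny][nx] and not seen[ny][nx]:
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            boxes.append((miny, minx, maxy, maxx))
    return boxes


def box_area(box):
    """Bounding-box area; +1 because the coordinates are inclusive pixels."""
    miny, minx, maxy, maxx = box
    return (maxy - miny + 1) * (maxx - minx + 1)


def keep_text_blobs(boxes, tol=0.5):
    """Keep character-sized blobs plus smaller blobs inside the text region."""
    if not boxes:
        return []
    med = median(box_area(b) for b in boxes)
    # Blobs whose area is within tol * median are treated as characters.
    chars = [b for b in boxes if abs(box_area(b) - med) <= tol * med]
    if not chars:
        return []
    # Larger box that estimates the whole text line.
    ty0 = min(b[0] for b in chars)
    tx0 = min(b[1] for b in chars)
    ty1 = max(b[2] for b in chars)
    tx1 = max(b[3] for b in chars)
    kept = []
    for b in boxes:
        inside = b[0] >= ty0 and b[1] >= tx0 and b[2] <= ty1 and b[3] <= tx1
        if inside and box_area(b) <= (1 + tol) * med:
            kept.append(b)
    return kept


def make_demo_plate():
    """Synthetic 9x24 plate: border line, three characters, a dash, a speck."""
    img = [[0] * 24 for _ in range(9)]
    for x in range(24):
        img[0][x] = 1                      # long border line along the top
    for x0 in (2, 6, 14):                  # three solid 5x3 "characters"
        for y in range(2, 7):
            for x in range(x0, x0 + 3):
                img[y][x] = 1
    img[4][10] = img[4][11] = 1            # dash between the characters
    img[8][23] = 1                         # stray speck outside the text
    return img


if __name__ == "__main__":
    print(keep_text_blobs(find_blobs(make_demo_plate())))
```

On the synthetic plate, the border line and the stray speck fall outside the estimated text box and are dropped, while the dash survives because it lies inside that box even though it is much smaller than a character.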
Answer 2:
You can probably achieve a lot by filtering contours out. Try to keep only contours that have a certain width/height ratio, a certain number of white pixels (counted with countNonZero()), and so on. If that does not help, you can always try to implement a text detection algorithm such as the Run Length Smoothing Algorithm (RLSA).
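A contour filter along these lines might look like the sketch below. It is only an illustration under stated assumptions: the image is a list of rows of 0/1 values, each candidate is given as an inclusive bounding box (miny, minx, maxy, maxx), and the function names and the ratio_range / fill_range thresholds are hypothetical values that would need tuning on real plates. In OpenCV the boxes would typically come from findContours and boundingRect, and the pixel count from countNonZero on the cropped region.

```python
def count_nonzero(img, box):
    """Count foreground pixels inside an inclusive bounding box."""
    miny, minx, maxy, maxx = box
    return sum(
        1
        for y in range(miny, maxy + 1)
        for x in range(minx, maxx + 1)
        if img[y][x]
    )


def looks_like_character(img, box, ratio_range=(0.2, 1.0), fill_range=(0.2, 0.9)):
    """Accept a blob only if its shape and ink density resemble a character."""
    miny, minx, maxy, maxx = box
    width = maxx - minx + 1
    height = maxy - miny + 1
    ratio = width / height                                # characters are taller than wide
    fill = count_nonzero(img, box) / (width * height)     # fraction of white pixels
    return (ratio_range[0] <= ratio <= ratio_range[1]
            and fill_range[0] <= fill <= fill_range[1])


def make_demo_image():
    """Synthetic 7x16 image: an L-shaped glyph and a long horizontal line."""
    img = [[0] * 16 for _ in range(7)]
    for y in range(1, 6):
        img[y][1] = 1          # vertical stroke of the "L"
    for x in range(1, 4):
        img[5][x] = 1          # horizontal stroke of the "L"
    for x in range(4, 14):
        img[0][x] = 1          # plate-border style line
    return img


if __name__ == "__main__":
    demo = make_demo_image()
    for candidate in [(1, 1, 5, 3), (0, 4, 0, 13)]:
        print(candidate, looks_like_character(demo, candidate))
```

The aspect-ratio test rejects long border lines, and the fill test rejects both solid smudges (nearly all white) and sparse noise (nearly all black), which are the kinds of artifacts described in the question.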
Source: https://stackoverflow.com/questions/26880788/removing-extra-pixels-lines-from-license-plate