Remove background color in image processing for OCR


I am trying to remove the background color from images in order to improve OCR accuracy. A sample looks like the one below:

6 answers

甜味超标

2021-02-01 11:25

The following shows a possible strategy for processing your image and then running OCR on it.

The last step is the OCR itself. My OCR routine is VERY basic, so I'm sure you can get better results.

The code is Mathematica code.

Not bad at all!

Happy的楠姐

2021-02-01 11:27

If your image is captured as RGB, just use the green channel, or quickly convert the Bayer pattern, which is probably what @misha's convert-to-greyscale solution does anyway.
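A minimal sketch of the green-channel idea, assuming the image is already loaded as a NumPy array of shape (height, width, 3). The function name `green_channel` is hypothetical. Green sits at index 1 in both RGB and BGR channel order, so the same slice works whether the array came from OpenCV or from another loader.

```python
import numpy as np

def green_channel(image: np.ndarray) -> np.ndarray:
    """Return the green plane of a 3-channel image as a 2-D greyscale array.

    Green is at index 1 in both RGB and BGR channel orders.
    """
    if image.ndim != 3 or image.shape[2] < 3:
        raise ValueError("expected an image of shape (height, width, 3)")
    # Copy so later in-place edits do not touch the original image.
    return image[:, :, 1].copy()
```

The resulting 2-D array can be passed to any of the thresholding approaches in the other answers.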

我在风中等你

2021-02-01 11:31
Hope this helps someone

You can get this with just a couple of lines of OpenCV and Python:
```
import cv2

# Load the image as grayscale (flag 0 is cv2.IMREAD_GRAYSCALE)
im = cv2.imread('....../Downloads/Gd3oN.jpg', 0)
# Adaptive threshold: Gaussian-weighted 11x11 neighbourhood, constant C = 2
th = cv2.adaptiveThreshold(im, 255, cv2.ADAPTIVE_THRESH_GAUSSIAN_C, cv2.THRESH_BINARY, 11, 2)
```
Here's the result

Here's the link for Image Thresholding in OpenCV
长情又很酷

2021-02-01 11:37
You can do this using GIMP (or any other image editing tool).
1. Open your image
2. Convert to grayscale
3. Duplicate the layer
4. Apply Gaussian blur using a large kernel (10x10) to the top layer
5. Calculate the image difference between the top and bottom layer
6. Threshold the image to yield a binary image
Blurred image:

Difference image:

Binary:

If you're doing it as a one-off, GIMP is probably good enough. If you expect to do this many times over, you could write an ImageMagick script or code up the same approach using something like Python and OpenCV.

Some problems with the above approach:
- The purple text (CENTURY) gets lost because it has lower contrast against the background than the other text. You could work around this by thresholding different parts of the image differently, or by using local histogram manipulation methods.

悲哀的现实

2021-02-01 11:47

In ImageMagick, you can use the -lat (local adaptive threshold) option to do that.

convert image.jpg -colorspace gray -negate -lat 50x50+5% -negate result.jpg

convert image.jpg -colorspace HSB -channel 2 -separate +channel \
-white-threshold 35% \
-negate -lat 50x50+5% -negate \
-morphology erode octagon:1 result2.jpg


孤独总比滥情好

2021-02-01 11:52

You can blur the image heavily, so that you get an almost clean estimate of the background. Then divide each color component of each pixel of the original image by the corresponding component of the same pixel in the background estimate. The result is text on a white background. Additional postprocessing can help further.

This method works when the text is darker than the background (in each color component). Otherwise, you can invert the colors first and then apply this method.
