Word segmentation using OpenCV [closed]

Posted by 点点圈 on 2019-11-27 12:33:48

Question


I am working on some scanned text images and I need to highlight all the words in each image. I know the problem is equivalent to finding subimages with extra whitespace around them.

OCR cannot be used, and I just need to outline each word with a border. Can someone suggest how this might be done using OpenCV?

I have tried reading about thresholding and segmentation. I am just looking for someone to point me to some relevant material.


Answer 1:


I think your image contains multi-line text. In that case, the first thing you have to do is detect the lines.

For that, first binarize the image using Otsu's method or adaptive thresholding.

Then you can use what is called a "horizontal histogram". It is like an ordinary histogram, but it counts the text pixels in each row, so it shows where there are lines of text and where there are blank spaces. Divide the image at the blank rows and you get each line.

[Image: a horizontal histogram]
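One possible way to compute such a histogram and cut the page at its blank rows is sketched below with NumPy. It assumes a binarized page in which text pixels are nonzero; the helper names `find_runs` and `split_lines` and the `min_count` parameter are hypothetical and only for illustration.

```python
import numpy as np

def find_runs(profile, min_count=1):
    """Return (start, end) pairs of consecutive entries >= min_count (end exclusive)."""
    filled = np.concatenate(([False], profile >= min_count, [False]))
    # A change between neighbours marks where a run begins or ends.
    edges = np.flatnonzero(filled[1:] != filled[:-1])
    return [(int(s), int(e)) for s, e in zip(edges[0::2], edges[1::2])]

def split_lines(binary):
    """Return one (top, bottom) row range per text line of a binary page."""
    # The "horizontal histogram": number of text pixels in each row.
    row_profile = np.count_nonzero(binary, axis=1)
    return find_runs(row_profile)

# Synthetic binary page with two text lines (text = 255).
page = np.zeros((50, 100), dtype=np.uint8)
page[5:15, 10:90] = 255
page[25:40, 10:60] = 255
line_ranges = split_lines(page)
print(line_ranges)  # [(5, 15), (25, 40)]
```

Each range can then be used to crop a line, for example `page[top:bottom, :]`. On noisy scans, raising `min_count` slightly keeps stray specks from being treated as text rows.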

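Now process each line in turn. First apply some dilation and erosion so that the letters of each word are grouped together, then compute the histogram for that line (this time along the columns, i.e. a vertical histogram, so that the gaps between words show up). You can then find the connected components on each line to get each word, and draw a boundary around each one.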

[Image: both the horizontal and vertical histograms]

This Stack Overflow question might help: How to convert an image into character segments?



Source: https://stackoverflow.com/questions/12764624/word-segmentation-using-opencv
