How to find a blank field on a scanned document image

a 夏天 submitted on 2019-12-07 20:18:17

Question


I want my application to fill in a single field in a form that exists as a black-and-white image file. The form always starts as the same paper version, but by the time my application gets it from my users, it may have been scanned or faxed more than once. Because of that, the field I need is not in the same place in every file.

My users do not always get the blank form from me, so I do not have the ability to print a mark or placeholder that I can recognize later.

There is text on the original blank form, but because it may have been faxed, I have only 200 dpi of resolution. The text is always big enough for a human to read, but I'm skeptical about OCR.

I have some budget so I do not need a free solution ... let's just say $2000.

That said, I am considering

  1. Get an OCR solution to find the text label on the field I need. I do not think I have the resources or expertise to roll my own. I do not need perfect recognition, since I already know what the text says. But I do need to know the X- and Y-coordinates. Is there software that does this? Or is the programming easier than I think?

  2. Build or buy software to recognize the edges of the form. From there, I could get the relative position of the field I need. I'm thinking of the dashed line my scanner software puts around the image of a small document. Is that a known algorithm or is there an available solution?

  3. Some other way to recognize the field I need. Attempts to google form filling software give me hundreds of matches for web forms, pdf forms, etc. that do not do what I need.

I'm not picky about language. My application runs on Linux, but if the best solution is Microsoft, I can probably make that work.

I'd appreciate your thoughts.


Answer 1:


If I understand correctly, the form is always the same, but may be shifted, scaled, or slightly rotated due to photocopying/faxing. In that case, your problem is one of image registration: find the optimal rigid transformation that makes a form from a user line up with your "model" form, in which you know the location of the field of interest. Once you know the transformation, you can compute the location of the field in the user's form.
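
As a rough illustration of that last step, here is a minimal sketch that maps a field from model coordinates into the user's form. It assumes the registration has already produced a uniform scale, a rotation angle, and a translation (a rigid transform plus scale). The names `map_point` and `map_field` and the parameter layout are hypothetical, not taken from any particular library.

```python
import math

def map_point(x, y, scale, angle_deg, tx, ty):
    """Map a point from model-form coordinates to user-form coordinates.

    Applies a uniform scale, then a rotation by angle_deg about the origin,
    then a translation by (tx, ty). All parameters are assumed to come from
    a registration step that is not shown here.
    """
    a = math.radians(angle_deg)
    cos_a, sin_a = math.cos(a), math.sin(a)
    u = scale * (cos_a * x - sin_a * y) + tx
    v = scale * (sin_a * x + cos_a * y) + ty
    return u, v

def map_field(box, scale, angle_deg, tx, ty):
    """Map the four corners of a field box given as (left, top, right, bottom)."""
    left, top, right, bottom = box
    corners = [(left, top), (right, top), (right, bottom), (left, bottom)]
    return [map_point(x, y, scale, angle_deg, tx, ty) for x, y in corners]

# Hypothetical example: a field at (100, 50)-(300, 80) in the model form,
# with a user scan that is 2x larger and shifted by (10, 20) pixels.
corners = map_field((100, 50, 300, 80), scale=2.0, angle_deg=0, tx=10, ty=20)
```

Mapping all four corners rather than only the top-left one keeps the result usable when the scan is slightly rotated, since the field is then no longer axis-aligned.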

There are many image registration algorithms, typically developed for applications such as aligning MR images of the brain. They tend to be computationally expensive and often require statistical priors. Fortunately, your case is easier: all you need to do is fit a rectangle around the contents of the user's form. Coordinate descent over the rectangle's parameters should work. You will need some tolerance for noise (junk outside the form).
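
One simple way to fit such a rectangle is sketched below. Note that it uses row and column ink counts (projection profiles) instead of coordinate descent, and it only handles an unrotated form. The image is assumed to be a list of rows of 0 (white) and 1 (black); the function name `content_bounds` and the `min_ink` threshold are hypothetical choices for illustration.

```python
def content_bounds(image, min_ink=2):
    """Return (left, top, right, bottom), inclusive, of the form contents.

    A row or column counts as content only if it holds at least min_ink
    black pixels, which gives some tolerance for isolated specks of noise.
    Returns None if nothing passes the threshold.
    """
    row_ink = [sum(row) for row in image]
    col_ink = [sum(col) for col in zip(*image)]
    rows = [i for i, n in enumerate(row_ink) if n >= min_ink]
    cols = [j for j, n in enumerate(col_ink) if n >= min_ink]
    if not rows or not cols:
        return None
    return cols[0], rows[0], cols[-1], rows[-1]

# Tiny synthetic page: a solid block of "form" plus one noise pixel.
page = [[0] * 12 for _ in range(10)]
for y in range(3, 7):
    for x in range(4, 9):
        page[y][x] = 1
page[0][0] = 1  # speck outside the form

bounds = content_bounds(page)
```

On a real 200 dpi fax the threshold would need tuning, and heavier noise such as a black scanner edge would call for something more robust than a fixed count.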




Answer 2:


Here's a little summary of some available OCR solutions (open source and not): http://googlesystem.blogspot.com/2007/04/open-source-ocr-software-sponsored-by.html




Answer 3:


Rigid registration may not be enough. Users may modify the layout and formatting of a template form: change the fonts, move a checkbox or an entry box, break a paragraph at different line positions, and so on. These differences are harder to handle than a pure shift, rotation, or scale transformation. Besides, if your image is a binary (black-and-white) image, I don't think those medical image registration algorithms, which work on grayscale images, will help much. You may need to change your cost function and minimization strategy accordingly.
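
To make the point about cost functions concrete, here is one possible choice for binary images, sketched under simplifying assumptions: the cost is the number of pixels that differ between the model and the shifted user image, and the minimization is a brute-force search over small integer shifts. It does not handle rotation, scale, or the layout changes described above. The names `mismatch` and `best_shift` are hypothetical.

```python
def mismatch(model, image, dx, dy):
    """Count pixels where model[y][x] differs from image[y + dy][x + dx].

    Both inputs are lists of rows of 0 (white) and 1 (black). Positions
    that fall outside image are treated as white.
    """
    h, w = len(image), len(image[0])
    cost = 0
    for y, row in enumerate(model):
        for x, m in enumerate(row):
            yy, xx = y + dy, x + dx
            p = image[yy][xx] if 0 <= yy < h and 0 <= xx < w else 0
            cost += m != p
    return cost

def best_shift(model, image, max_shift=3):
    """Try every integer shift in [-max_shift, max_shift] on both axes.

    Returns (cost, dx, dy) for the shift with the fewest mismatched pixels.
    """
    candidates = (
        (mismatch(model, image, dx, dy), dx, dy)
        for dy in range(-max_shift, max_shift + 1)
        for dx in range(-max_shift, max_shift + 1)
    )
    return min(candidates)

# Synthetic check: the user image is the model moved right 2 and down 1.
model = [[0] * 8 for _ in range(8)]
for y, x in [(1, 1), (1, 2), (2, 1), (3, 3), (2, 4)]:
    model[y][x] = 1
image = [[0] * 8 for _ in range(8)]
for y in range(8):
    for x in range(8):
        if model[y][x]:
            image[y + 1][x + 2] = 1

result = best_shift(model, image)
```

Exhaustive search is only practical for small shift ranges or downsampled images; a full-page scan would need a coarse-to-fine search or a different optimizer.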



Source: https://stackoverflow.com/questions/548309/how-to-find-blank-field-on-scanned-document-image
