Recognize a number from an image

-上瘾入骨i 2021-01-31 08:53

I'm trying to write an application to find the numbers inside an image and add them up.

How can I identify the written number in an image?

6 answers
  •  情歌与酒
    2021-01-31 09:20

    You will most likely need to do the following:

    1. Apply the Hough Transform algorithm to the entire page. This should yield a series of page sections.

    2. For each section you get, apply it again. If the current section yields two elements, then you should be dealing with a rectangle similar to the above.

    3. Once you are done, you can use an OCR engine to extract the numeric value.

    In this case, I would recommend you take a look at JavaCV (OpenCV Java Wrapper) which should allow you to tackle the Hough Transform part. You would then need something akin to Tess4j (Tesseract Java Wrapper) which should allow you to extract the numbers you are after.

    As an extra note, to reduce the amount of false positives, you might want to do the following:

    1. Crop the image if you know that some regions will never contain the data you are after. This should give you a smaller picture to work with.

    2. It might be wise to convert the image to greyscale (assuming you are working with a colour image). Colours can have a negative impact on the OCR's ability to resolve the image.
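
    Both preprocessing steps can be sketched with the JDK's own BufferedImage, without any extra library. This is only a minimal sketch: the class name ImagePrep, its method names, and the integer BT.601 luminance weights are assumptions made for illustration, and JavaCV offers its own routines for the same job.

```java
import java.awt.image.BufferedImage;

final class ImagePrep {
    private ImagePrep() {}

    /** Returns a copy of the rectangle (x, y, width, height) of the source image. */
    static BufferedImage crop(BufferedImage src, int x, int y, int width, int height) {
        // Copy the pixels so the result does not share data with the source image.
        int[] pixels = src.getRGB(x, y, width, height, null, 0, width);
        BufferedImage copy = new BufferedImage(width, height, BufferedImage.TYPE_INT_RGB);
        copy.setRGB(0, 0, width, height, pixels, 0, width);
        return copy;
    }

    /** Converts a colour image to 8-bit greyscale using integer luminance weights. */
    static BufferedImage toGreyscale(BufferedImage src) {
        BufferedImage grey = new BufferedImage(
                src.getWidth(), src.getHeight(), BufferedImage.TYPE_BYTE_GRAY);
        for (int y = 0; y < src.getHeight(); y++) {
            for (int x = 0; x < src.getWidth(); x++) {
                int rgb = src.getRGB(x, y);
                int r = (rgb >> 16) & 0xFF;
                int g = (rgb >> 8) & 0xFF;
                int b = rgb & 0xFF;
                // Weighted sum of the channels (assumed weights: 0.299, 0.587, 0.114).
                int luminance = (299 * r + 587 * g + 114 * b) / 1000;
                // Write the grey level straight into the raster to avoid colour conversion.
                grey.getRaster().setSample(x, y, 0, luminance);
            }
        }
        return grey;
    }
}
```

    For example, ImagePrep.toGreyscale(ImagePrep.crop(page, x, 0, page.getWidth() - x, page.getHeight())) would keep everything to the right of a hypothetical column x and hand a greyscale image to the later steps.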

    EDIT: As per your comment, given something like this:

    +------------------------------+
    |                   +---+---+  |
    |                   |   |   |  |
    |                   +---+---+  |
    |                   +---+---+  |
    |                   |   |   |  |
    |                   +---+---+  |
    |                   +---+---+  |
    |                   |   |   |  |
    |                   +---+---+  |
    |                   +---+---+  |
    |                   |   |   |  |
    |                   +---+---+  |
    +------------------------------+
    

    You would crop the image to remove the area that does not contain relevant data (the part on the left). After cropping, you would get something like this:

    +-------------+
    |+---+---+    |
    ||   |   |    | 
    |+---+---+    |
    |+---+---+    |
    ||   |   |    |
    |+---+---+    |
    |+---+---+    |
    ||   |   |    |
    |+---+---+    |
    |+---+---+    |
    ||   |   |    |
    |+---+---+    |
    +-------------+
    

    The idea would be to run the Hough Transform so that you can get segments of the page which contain rectangles like so:

    +---+---+    
    |   |   |     
    +---+---+ 
    

    You would then apply the Hough Transform to each of these rectangles again, end up with two segments, and take the left one.
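
    To make the splitting step concrete, here is a minimal pure-Java sketch of a Hough line transform applied to one such rectangle. It assumes the rectangle has already been binarised into a boolean grid (true = line pixel) and that its border lines are axis-aligned and one pixel thick. The class HoughSketch and its method names are hypothetical. In practice you would more likely use the Hough routines that OpenCV exposes through JavaCV.

```java
import java.util.ArrayList;
import java.util.List;

final class HoughSketch {
    private HoughSketch() {}

    /**
     * Hough line transform: every line pixel votes for all (theta, rho) pairs
     * with rho = x*cos(theta) + y*sin(theta).
     * Returns votes[thetaIndex][rho + maxRho], where theta = pi * thetaIndex / thetaSteps.
     */
    static int[][] accumulate(boolean[][] pixels, int thetaSteps) {
        int height = pixels.length;
        int width = pixels[0].length;
        int maxRho = (int) Math.ceil(Math.hypot(width, height));
        int[][] votes = new int[thetaSteps][2 * maxRho + 1];
        for (int t = 0; t < thetaSteps; t++) {
            double theta = Math.PI * t / thetaSteps; // theta in [0, pi)
            double cos = Math.cos(theta);
            double sin = Math.sin(theta);
            for (int y = 0; y < height; y++) {
                for (int x = 0; x < width; x++) {
                    if (pixels[y][x]) {
                        int rho = (int) Math.round(x * cos + y * sin);
                        votes[t][rho + maxRho]++;
                    }
                }
            }
        }
        return votes;
    }

    /** Rho values (ascending) that collected at least minVotes for one angle. */
    static List<Integer> linesAt(int[][] votes, int thetaIndex, int minVotes) {
        int maxRho = (votes[thetaIndex].length - 1) / 2;
        List<Integer> rhos = new ArrayList<>();
        for (int i = 0; i < votes[thetaIndex].length; i++) {
            if (votes[thetaIndex][i] >= minVotes) {
                rhos.add(i - maxRho);
            }
        }
        return rhos;
    }

    /**
     * Given the x positions of the vertical lines (theta = 0, so rho equals x),
     * returns {startInclusive, endExclusive} of the columns between the first two.
     */
    static int[] leftSegment(List<Integer> verticalLines) {
        if (verticalLines.size() < 2) {
            throw new IllegalArgumentException("need at least two vertical lines");
        }
        return new int[] {verticalLines.get(0) + 1, verticalLines.get(1)};
    }
}
```

    With thetaSteps = 180, index 0 corresponds to vertical lines and index 90 to horizontal ones. The vote threshold is a tuning parameter: setting it close to the rectangle's height keeps the vertical borders and rejects the few pixels contributed by the horizontal ones.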

    Once you have the left segment, you would then apply the OCR.
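
    Since the goal is to add the numbers up, the OCR output still has to be parsed. The following is a minimal sketch, assuming the OCR step returns one string per left segment. OcrSum and sumRecognized are hypothetical names, and entries that are not purely digits after trimming are skipped rather than guessed at.

```java
import java.util.List;

final class OcrSum {
    private OcrSum() {}

    /** Adds up every OCR result that consists only of digits once whitespace is trimmed. */
    static long sumRecognized(List<String> ocrTexts) {
        long total = 0;
        for (String text : ocrTexts) {
            String trimmed = text.trim();
            // Skip empty or doubtful reads (letters, punctuation) instead of repairing them.
            // The length limit keeps Long.parseLong from overflowing.
            if (trimmed.matches("\\d{1,18}")) {
                total += Long.parseLong(trimmed);
            }
        }
        return total;
    }
}
```

    Skipping doubtful reads keeps the sum from silently including a misrecognised value. You may prefer to log those entries so they can be checked by hand.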

    You can try to apply the OCR beforehand, but at best it will recognize both numeric values, the written one and the typed one, which, from what I understand, is not what you are after.

    Also, the extra lines which depict the rectangles might throw the OCR off track, and make it yield bad results.
