Question
I have this image
How can I OCR it? I know this is very challenging, but I would really appreciate any help.
Answer 1:
If you have the time to develop the detection yourself, I would do it roughly like this:
- Get 1,000 images or so and either transcribe them yourself or let the people on Amazon Mechanical Turk do it for you; it will cost virtually nothing. Now you have a labelled set to tune your algorithm on and to measure how well you are doing.
- Like Ryan wrote, play with standard image filters (contrast, color, Gaussian blur, etc.), manually or with something like http://www.roborealm.com/ . See if you can find a combination that makes the text really stand out.
- Try the OCR libraries again on the filtered images.
- If the libraries still don't work, use your knowledge of the picture to split it into separate digits. You know how many digits there should be and roughly how many pixels each should take. Use edge detection or something similar to find the digits and split them out separately (perhaps standard OCR feature extraction together with clustering will give you each digit as a cluster?).
- Do standard OCR feature extraction on each digit (don't be too creative: use existing libraries, or at least read up on which features are the most common and simple) and feed those features, together with the labels you collected in the first step, into a neural network or an SVM.
- Improve your feature set until the machine learning works.
Since you have only ten digits, which are fairly consistent between images, this should work.
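For the splitting and measuring steps, here is a minimal sketch in plain Python. It assumes the image has already been filtered and thresholded into rows of 0/1 values (1 meaning "ink"); the function names `split_columns`, `split_evenly` and `digit_accuracy` are made up for illustration and do not come from any library.

```python
def split_columns(binary, min_ink=1):
    """Return (start, end) column ranges that contain ink; end is exclusive."""
    height = len(binary)
    width = len(binary[0]) if height else 0
    # Vertical projection: number of ink pixels in each column.
    profile = [sum(binary[y][x] for y in range(height)) for x in range(width)]
    spans, start = [], None
    for x, ink in enumerate(profile):
        if ink >= min_ink and start is None:
            start = x                      # a run of ink columns begins
        elif ink < min_ink and start is not None:
            spans.append((start, x))       # the run ended at the previous column
            start = None
    if start is not None:
        spans.append((start, width))       # run reaches the right edge
    return spans


def split_evenly(width, n_digits):
    """Fallback when the gaps are not clean: cut the width into equal slices."""
    return [(i * width // n_digits, (i + 1) * width // n_digits)
            for i in range(n_digits)]


def digit_accuracy(predicted, expected):
    """Fraction of hand-labelled digits that were read correctly."""
    total = sum(len(exp) for exp in expected)
    correct = sum(p == e
                  for pred, exp in zip(predicted, expected)
                  for p, e in zip(pred, exp))
    return correct / total if total else 0.0
```

If `split_columns` does not return the number of digits you expect, falling back to `split_evenly` uses the fact that the digit count and approximate digit width are known in advance. `digit_accuracy` gives you one number to watch while you tune filters and features against the labelled set.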
Answer 2:
I would suggest 2 libraries to get you going:
- Tesseract
- Emgu CV: it comes with loads of examples; look for the license plate detection one as a good place to start.
Answer 3:
Try playing with the contrast and gamma on the image. For most libraries, all you need is a solid outline on the characters. Depending on your performance SLA, you could run through various contrast/gamma scenarios and let the OCR software take a couple of shots at it. Aggregate the results and see whether they agree. This could give you a fairly accurate result in the long run.
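A rough sketch of that "several shots, then aggregate" idea in plain Python. The `ocr` argument is a stand-in for whatever OCR call you end up using (it is an assumption, not a real library function), the image is assumed to be rows of 8-bit grayscale values, and the gamma values are arbitrary starting points.

```python
from collections import Counter


def adjust_gamma(pixels, gamma):
    """Apply gamma correction to rows of 8-bit grayscale values."""
    lut = [round(255 * (v / 255) ** gamma) for v in range(256)]
    return [[lut[v] for v in row] for row in pixels]


def vote_per_position(readings):
    """Majority vote per character, using only readings of the most common length."""
    if not readings:
        return ""
    length = Counter(len(r) for r in readings).most_common(1)[0][0]
    same_length = [r for r in readings if len(r) == length]
    return "".join(Counter(chars).most_common(1)[0][0]
                   for chars in zip(*same_length))


def ocr_with_sweep(pixels, ocr, gammas=(0.5, 0.75, 1.0, 1.5, 2.0)):
    """Run the (hypothetical) ocr callable once per gamma and aggregate."""
    readings = [ocr(adjust_gamma(pixels, g)) for g in gammas]
    return vote_per_position(readings)
```

Voting per position only makes sense when the readings line up, which is why readings with an unusual length are dropped before the vote. The same loop could sweep contrast as well as gamma.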
Answer 4:
ML (a neural network) for digits is usually accurate with minimal training and is easy to use. The ordering might be handled by OCR-ing with a "moving window", that is, cropping out a narrow slice of the width at a time. The output for the first character might be ???1160060060??1???, and you pick the most commonly appearing digit (0) while iterating over the width of the image. Maybe teach your neural net to also recognize the space between digits, and you're good. Otherwise, cleanly partitioning the image into ten parts requires automatic cropping first. But all in all, this is very similar to the kind of task you'd get in a beginners' AI course at university.
Source: https://stackoverflow.com/questions/13630114/how-to-ocr-engraved-text
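The voting part of the moving-window idea could look roughly like this in plain Python. It assumes the classifier has already produced one character per window position, with "?" for "no digit recognized", and that the digits are evenly spaced so the sequence can be cut into equal chunks; `vote` and `decode_windows` are illustrative names only.

```python
from collections import Counter


def vote(window_preds, unknown="?"):
    """Most common prediction in a run of windows, ignoring unknowns."""
    counts = Counter(c for c in window_preds if c != unknown)
    return counts.most_common(1)[0][0] if counts else unknown


def decode_windows(window_preds, n_digits, unknown="?"):
    """Cut the per-window predictions into n_digits chunks and vote in each."""
    n = len(window_preds)
    bounds = [i * n // n_digits for i in range(n_digits + 1)]
    return "".join(vote(window_preds[a:b], unknown)
                   for a, b in zip(bounds, bounds[1:]))


# The example run from above: "0" wins with 5 votes against 3 each for "1" and "6".
first_digit = vote("???1160060060??1???")
```

If the network also learns a "gap" class, the chunk boundaries could come from the gap predictions instead of from equal spacing.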