Image processing to improve Tesseract OCR accuracy

鱼传尺愫 2020-11-22 14:41

I've been using tesseract to convert documents into text. The quality of the documents ranges wildly, and I'm looking for tips on what sort of image processing might impr

13 Answers
  •  清酒与你
    2020-11-22 15:13

    Adaptive thresholding is important if the lighting is uneven across the image. My preprocessing using GraphicsMagick is described in this post: https://groups.google.com/forum/#!topic/tesseract-ocr/jONGSChLRv4

    GraphicsMagick also has the -lat option for local adaptive thresholding, which I will try soon.

    Another method of thresholding using OpenCV is described here: http://docs.opencv.org/trunk/doc/py_tutorials/py_imgproc/py_thresholding/py_thresholding.html
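
    To show why this matters under uneven lighting, here is a minimal pure-Python sketch of mean-based adaptive thresholding. It is not the code from either link above. The function names, the block size, the offset and the tiny synthetic "page" are all made up for illustration. Each pixel is compared against the mean of its own neighbourhood (computed from a summed-area table) instead of against one global cutoff. In practice a library routine such as OpenCV's cv2.adaptiveThreshold would be the usual choice.

```python
def global_threshold(gray, t=128):
    """Binarize with one fixed cutoff for the whole image."""
    return [[255 if v > t else 0 for v in row] for row in gray]


def adaptive_mean_threshold(gray, block=5, offset=15):
    """Binarize a grayscale image given as a list of rows of 0-255 ints.

    A pixel becomes white (255) if it is brighter than the mean of the
    block x block window around it minus `offset`, otherwise black (0).
    `block` and `offset` are illustrative defaults, not tuned values.
    """
    h, w = len(gray), len(gray[0])

    # Summed-area table with an extra zero row and column, so any window
    # sum can be read with four lookups.
    sat = [[0] * (w + 1) for _ in range(h + 1)]
    for y in range(h):
        row_sum = 0
        for x in range(w):
            row_sum += gray[y][x]
            sat[y + 1][x + 1] = sat[y][x + 1] + row_sum

    r = block // 2
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        y0, y1 = max(0, y - r), min(h, y + r + 1)  # window clipped at edges
        for x in range(w):
            x0, x1 = max(0, x - r), min(w, x + r + 1)
            total = sat[y1][x1] - sat[y0][x1] - sat[y1][x0] + sat[y0][x0]
            mean = total / ((y1 - y0) * (x1 - x0))
            out[y][x] = 255 if gray[y][x] > mean - offset else 0
    return out


def count_black(binary):
    """Number of pixels classified as ink."""
    return sum(row.count(0) for row in binary)


# Synthetic 5x8 "page": background brightens left to right (100..170),
# with two dark "ink" pixels, each 60 levels below its local background.
page = [[100 + 10 * x for x in range(8)] for _ in range(5)]
page[2][1] -= 60
page[2][6] -= 60

print(count_black(global_threshold(page)))         # 16: dim left side lost
print(count_black(adaptive_mean_threshold(page)))  # 2: only the ink pixels
```

    With the fixed cutoff, the three dim columns on the left are all treated as ink. The adaptive version keeps only the two real ink pixels, which is the behaviour you want before handing a scan to tesseract.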
