Extracting lines from an image to feed to OCR - Tesseract

闹比i 2021-02-10 20:57

I was watching this talk from PyCon: http://youtu.be/B1d9dpqBDVA?t=15m34s. Around the 15:33 mark, the speaker talks about extracting lines from an image (a receipt) and then feeding

3 answers
  •  孤城傲影
    2021-02-10 21:42

    Take a look at the technique used to detect the skew angle of a text.

    Groups of lines are used to isolate the text in an image (this is the interesting part).

    From this result you can easily detect the upper and lower limits of each line of text; the text itself is located between them. I've faced a similar problem before, and the code might be useful to you:
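One way to find those upper and lower limits, sketched here as an illustration and not as the answerer's own code, is a horizontal projection profile: count the ink pixels in every row and treat each run of non-empty rows as one line of text. This is a simpler technique than grouping detected line segments, and it assumes the image has already been deskewed and binarized. The function name `find_text_bands` and the `min_ink` threshold are hypothetical, and the image is represented as plain Python lists so the example runs without extra libraries.

```python
def find_text_bands(binary, min_ink=1):
    """Return (top, bottom) row ranges that contain text.

    `binary` is a list of rows, each a list of 0/1 values where 1 marks
    an ink (text) pixel. `bottom` is exclusive, so rows top..bottom-1
    belong to the band. Assumes the page is already deskewed.
    """
    # Horizontal projection profile: number of ink pixels in each row.
    profile = [sum(row) for row in binary]

    bands = []
    start = None
    for y, ink in enumerate(profile):
        if ink >= min_ink and start is None:
            start = y                     # entering a text line
        elif ink < min_ink and start is not None:
            bands.append((start, y))      # leaving a text line
            start = None
    if start is not None:                 # text touches the bottom edge
        bands.append((start, len(profile)))
    return bands


# Tiny 7x5 example: two text lines separated by blank rows.
page = [
    [0, 0, 0, 0, 0],
    [0, 1, 1, 0, 0],
    [1, 1, 0, 1, 0],
    [0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0],
    [0, 1, 0, 1, 1],
    [0, 0, 0, 0, 0],
]
print(find_text_bands(page))  # [(1, 3), (5, 6)]
```

On a real scan, raising `min_ink` above 1 is one way to keep isolated noise pixels from being counted as a line; a suitable value depends on the image and is left as an assumption here.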

    All you need to do from here is crop each pair of lines and feed that as an image to Tesseract.
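The cropping step could look roughly like the sketch below. It assumes the line limits are already known as `(top, bottom)` row pairs with `bottom` exclusive, and it takes the OCR engine as a callable so the example runs without Tesseract installed. The names `crop_and_ocr`, `pad`, and `fake_ocr` are hypothetical; with the pytesseract wrapper installed, `pytesseract.image_to_string` is the kind of callable that would be passed in, applied to an image loaded as a NumPy array.

```python
def crop_and_ocr(image, bands, ocr, pad=0):
    """Crop each text band out of `image` and run `ocr` on the strip.

    `image` is any row-indexable image (a list of rows or a NumPy array),
    `bands` is a list of (top, bottom) row pairs with `bottom` exclusive,
    and `ocr` is a callable that turns an image strip into a string.
    `pad` adds a margin of rows above and below each band.
    """
    height = len(image)
    texts = []
    for top, bottom in bands:
        y0 = max(0, top - pad)            # clamp to the top edge
        y1 = min(height, bottom + pad)    # clamp to the bottom edge
        texts.append(ocr(image[y0:y1]))   # row slice = one text line
    return texts


def fake_ocr(strip):
    """Stand-in for a real OCR engine: reports the strip height."""
    return f"{len(strip)} rows"


image = [[0] * 4 for _ in range(10)]
print(crop_and_ocr(image, [(1, 3), (5, 6)], fake_ocr, pad=1))
# ['4 rows', '3 rows']
```

Since each strip holds a single line, Tesseract's page segmentation mode 7 (treat the image as a single text line) may be a reasonable setting to try, although whether it helps depends on the input.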
