Tesseract and tiff format - spp not in set {1,3}

佛祖请我去吃肉 2021-02-06 21:32

While trying to run this command:

tesseract bond111.tif bond111 batch.nochop makebox

I get the following error:

Error in pixReadFromT         


        
4 answers
  •  执念已碎
    2021-02-06 21:51

    Thanks for your post, ZakW; you pointed me in the right direction. I also needed to set '-depth 8', but even then the quality was not good enough for OCR, whatever I tried.

    What worked for me is this solution:

    ghostscript -o document.tiff -sDEVICE=tiffgray -r720x720 -g6120x7920 -sCompression=lzw document.pdf
    tesseract document.tiff document -l deu
    vim document.txt
    

    This way I got perfect text, including umlauts, in German.
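
The -g value in the command above looks like the page size in pixels: 8.5 in x 11 in (US Letter) at 720 dpi works out to 6120 x 7920. Here is a minimal sketch of that arithmetic, assuming a Letter-sized page. `geometry` and `render_cmd` are hypothetical helper names, not part of Ghostscript, and the script only prints the command so it can be checked before running. On many systems the binary is called `gs` rather than `ghostscript`.

```shell
#!/bin/sh
# Hypothetical helper: page size in pixels for Ghostscript's -g option.
# Sizes are given in tenths of an inch so plain integer arithmetic works
# (85 x 110 = US Letter, 8.5 in x 11 in; this page size is an assumption).
geometry() {
    dpi=$1; w_tenths=$2; h_tenths=$3
    echo "$((dpi * w_tenths / 10))x$((dpi * h_tenths / 10))"
}

# Hypothetical helper: print the render command instead of executing it.
render_cmd() {
    pdf=$1; tiff=$2; res=$3
    echo "ghostscript -o $tiff -sDEVICE=tiffgray -r${res}x${res} -g$(geometry "$res" 85 110) -sCompression=lzw $pdf"
}

render_cmd document.pdf document.tiff 720
```

If the resolution is changed, -r and -g have to change together; otherwise the page gets cropped or padded instead of rescaled.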
