Tesseract training for a new font

后端 未结 3 1953
日久生厌
日久生厌 2021-01-31 10:17

I\'m still new to Tesseract OCR and after using it in my script noticed it had a relatively big error rate for the images I was trying to extract text from. I came across Tesser

3条回答
  •  既然无缘
    2021-01-31 11:05

    If you want to train tesseract with the new font, then generate .traineddata file with your desired font. For generating .traineddata, first you will need .tiff file and .box file. You can create these files using jTessBoxEditor. Tutorial for jBossTextEditor is here. While making .tiff file you can set the font in which you have train tesseract. Either you can jTessBoxEditor for generating .traineddata or serak-tesseract-trainer is also there. I have used both and I would say that for generating tiff and box files jTessBoxEditor is great and for training tesseract use serak.

提交回复
热议问题