“Adding” new fonts to Tesseract eng.traineddata

既然无缘 2021-01-31 11:41

As far as I know, Tesseract 3.x ships with six English fonts (correct me if I'm wrong). I need to train Tesseract for five more fonts. I need only capital letters and digits.

2 answers
  • 2021-01-31 12:06

    You should use a different name, e.g., eng1.traineddata. That way you can use the new data alongside the original by specifying the language option -l eng+eng1.
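
To illustrate the combined-language option above, here is a minimal Python sketch that builds the corresponding tesseract command line. The helper name `tesseract_command` and the file names `scan.png` and `result` are hypothetical, and the sketch assumes eng1.traineddata has already been placed in Tesseract's tessdata directory next to eng.traineddata.

```python
def tesseract_command(image, output_base, languages):
    """Build an argv list for the tesseract CLI.

    Hypothetical helper: the languages are joined with '+', which is how
    the -l option accepts more than one traineddata file at once.
    """
    if not languages:
        raise ValueError("at least one language is required")
    return ["tesseract", image, output_base, "-l", "+".join(languages)]


# Assumed file names, for illustration only.
cmd = tesseract_command("scan.png", "result", ["eng", "eng1"])
print(" ".join(cmd))  # tesseract scan.png result -l eng+eng1
```

The resulting list can be passed to `subprocess.run(cmd, check=True)` on a machine where tesseract is installed.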

  • 2021-01-31 12:30

    If your new trained data covers a different font, I think you won't get dictionary correction for that font.

    To add the new trained data, you can do this (I'm using PHP here):

    // Language code of your new trained data, i.e. the file-name prefix
    // before .traineddata (by convention three letters, whatever you choose).
    $language = "eng+deu";
    $settingLanguage = $tesseract->setLanguage($language);
    

    As the setLanguage() function in tesseract.php shows, that function is how you set the language.
