Question
I'm trying to train Tesseract on Windows, which requires a TIFF/box file pair. I tried to create the pair with jTessBoxEditor, but it doesn't accept images as input. I also tried boxFactory, but it doesn't run properly. Does anyone know the best tool for creating the pair from images?
Thanks
Answer 1:
If you have jTessBoxEditor, then you already have the Tesseract binaries. Go to the tesseract-ocr subfolder of jTessBoxEditor and run the following command:
tesseract.exe D:\testocr\TestImage.tif D:\testocr\TestImage batch.nochop makebox
It should generate the file D:\testocr\TestImage.box. Then, in jTessBoxEditor, go to the Box Editor tab and open your image. The box file is loaded automatically, so you can check that everything is OK and correct any mistakes.
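If there are several images to process, the command above can be assembled and run from a script. The following is only a minimal Python sketch: the function names `makebox_command` and `run_makebox` are hypothetical, and it assumes `tesseract.exe` is reachable from the current directory or PATH (otherwise pass its full path).

```python
import os
import subprocess


def makebox_command(image_path, tesseract_exe="tesseract.exe"):
    """Build the argument list for the makebox command shown above.

    The output base is the image path without its extension, so the
    resulting .box file gets the same name as the image.
    """
    output_base, _extension = os.path.splitext(image_path)
    return [tesseract_exe, image_path, output_base, "batch.nochop", "makebox"]


def run_makebox(image_path, tesseract_exe="tesseract.exe"):
    """Run the command and raise an error if Tesseract exits with a failure."""
    subprocess.run(makebox_command(image_path, tesseract_exe), check=True)


# Example (not executed here):
# run_makebox(r"D:\testocr\TestImage.tif")
```

Deriving the output base from the image path keeps the two file names identical apart from the extension.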
Answer 2:
I had the same kind of problem: I was unable to open images properly in jTessBoxEditor to work with their boxes. I realized that one essential requirement is that the name of the .tif image and the name of the .box file must be identical, except for the extension. Otherwise, jTessBoxEditor cannot tell which box file goes with which image. So use the command suggested by darkpotpot above, make sure the two file names match as described, and then click the "Open" button in the Box Editor tab of jTessBoxEditor; that should work.
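The naming rule can be checked before opening anything in jTessBoxEditor. This is a rough Python sketch, not part of either tool: `images_without_box` is a hypothetical helper, and it assumes the images use a .tif or .tiff extension and that base names are compared exactly, including case.

```python
import os

# Assumption: training images are TIFF files.
IMAGE_EXTENSIONS = {".tif", ".tiff"}


def images_without_box(filenames):
    """Return the image file names that have no same-named .box file."""
    # Collect the base names of every .box file in the listing.
    box_names = {
        os.path.splitext(name)[0]
        for name in filenames
        if os.path.splitext(name)[1].lower() == ".box"
    }
    # Keep the images whose base name has no matching .box entry.
    return sorted(
        name
        for name in filenames
        if os.path.splitext(name)[1].lower() in IMAGE_EXTENSIONS
        and os.path.splitext(name)[0] not in box_names
    )


# Example (not executed here):
# print(images_without_box(os.listdir(r"D:\testocr")))
```

Any image it reports either has no box file yet or has one saved under a different name.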
Source: https://stackoverflow.com/questions/31751402/how-to-generate-a-tiff-box-file-from-an-image-to-train-tesseract-in-windows