Forcing Tesseract to match pattern (four digits in a row)

Submitted by 放荡痞女 on 2019-12-06 04:38:37

You have not configured this correctly.

user_patterns_suffix is meant to indicate the file extension of a text file that contains your patterns, e.g.

user_patterns_suffix pats

would mean you need to put a file in the Tesseract tessdata folder:

tessdata/eng.pats

... assuming eng was the language you were using.
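As a rough illustration of the layout described above, here is a minimal Java sketch that writes such a patterns file. The class `UserPatternsFile` and its `write` method are hypothetical helpers, not part of Tesseract. The pattern `\d\d\d\d` assumes the `\d` digit class from Tesseract's user-patterns syntax, and whether a pattern this short is actually honoured is a separate question.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper for illustration only; not part of Tesseract or Tess4J.
class UserPatternsFile {

    // Writes <tessdata>/<lang>.<suffix> with one pattern per line and returns its path.
    static Path write(Path tessdata, String lang, String suffix, List<String> patterns)
            throws IOException {
        Files.createDirectories(tessdata);
        Path file = tessdata.resolve(lang + "." + suffix);
        return Files.write(file, patterns, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) throws IOException {
        // A temporary directory stands in for the real tessdata folder here.
        Path tessdata = Files.createTempDirectory("tessdata");
        // "\\d" in Java source becomes the two characters \d in the file
        // (assumed to mean "one digit" in Tesseract's pattern syntax).
        Path file = write(tessdata, "eng", "pats", Arrays.asList("\\d\\d\\d\\d"));
        System.out.println("Wrote " + file);
    }
}
```

With `user_patterns_suffix pats` and language `eng`, the file ends up named `eng.pats`, which matches the naming described in the answer.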

See more here:

http://tesseract-ocr.googlecode.com/svn/trunk/doc/tesseract.1.html#_config_files_and_augmenting_with_user_data

I do recall that user patterns may not be any shorter than 6 fixed chars before a pattern, so you may not be able to accomplish this in any case. Try the correct config first, though.

They look like init-only parameters; as such, they need to be in a config file (for instance, one named bazaar placed under the tessdata/configs folder) whose name is then passed into the setConfigs method.

instance.setConfigs(Arrays.asList("bazaar"));
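For illustration, a minimal Java sketch of creating such a config file follows. The class `InitConfigFile` and its `write` method are hypothetical helpers, and the sketch assumes the config file is a plain-text file with one `name value` pair per line, located in a `configs` subfolder of tessdata.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Collections;

// Hypothetical helper for illustration only; not part of Tesseract or Tess4J.
class InitConfigFile {

    // Writes <tessdata>/configs/<name> containing the single line
    // "user_patterns_suffix <suffix>" and returns the path of that file.
    static Path write(Path tessdata, String name, String suffix) throws IOException {
        Path configs = tessdata.resolve("configs");
        Files.createDirectories(configs);
        String line = "user_patterns_suffix " + suffix;
        return Files.write(configs.resolve(name),
                Collections.singletonList(line), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) throws IOException {
        // A temporary directory stands in for the real tessdata folder here.
        Path tessdata = Files.createTempDirectory("tessdata");
        System.out.println("Wrote " + write(tessdata, "bazaar", "pats"));
    }
}
```

Once a file like this exists under the real tessdata/configs folder, the setConfigs call above refers to it by its file name, assuming the Tess4J instance's data path points at that tessdata folder.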

References:
https://github.com/tesseract-ocr/tesseract/blob/master/doc/tesseract.1.asc
https://github.com/tesseract-ocr/tesseract/wiki/ControlParams
http://tess4j.sourceforge.net/docs/docs-1.4/
