Tess4j on Windows 64-bit: exception on multiple threads

Submitted by 两盒软妹~` on 2019-12-04 14:41:45

On its own, Tesseract can only convert images to text. It cannot read PDFs, even when the PDFs are just scanned pages.

Under the hood, Tess4j uses Ghostscript (through ghost4j) to convert each page to a single image file, which it then feeds to Tesseract for OCR. It concatenates the resulting strings into a single string, which it returns.
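
The flow described above can be sketched roughly as follows. This is not Tess4j's actual source: `PdfOcrPipeline`, `rasterizer`, and `pageOcr` are hypothetical names, and plain functions stand in for the Ghostscript and Tesseract steps so that the example runs without either installed.

```java
import java.util.List;
import java.util.function.Function;

final class PdfOcrPipeline {
    // Hypothetical stand-in for the Ghostscript step: PDF path -> one image path per page.
    private final Function<String, List<String>> rasterizer;
    // Hypothetical stand-in for the Tesseract step: image path -> recognized text.
    private final Function<String, String> pageOcr;

    PdfOcrPipeline(Function<String, List<String>> rasterizer,
                   Function<String, String> pageOcr) {
        this.rasterizer = rasterizer;
        this.pageOcr = pageOcr;
    }

    // Rasterize the PDF, OCR each page image in order, and join the results
    // into the single string that is returned to the caller.
    String recognize(String pdfPath) {
        StringBuilder text = new StringBuilder();
        for (String pageImage : rasterizer.apply(pdfPath)) {
            text.append(pageOcr.apply(pageImage));
        }
        return text.toString();
    }
}
```

The point to notice is that the rasterizing step sits inside every PDF call, so any threading limitation in that step applies to the whole OCR call.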

The exception occurs because Tess4j uses ghost4j in a way that does not support multithreading. As described here, ghost4j does support multithreading through its high-level API (in practice, it runs separate Ghostscript instances, each invoked from a different JVM). Tess4j, however, uses the low-level API, where only a single Ghostscript instance may be in use at a time.
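
If only one Ghostscript instance can be active at a time, one possible workaround (an assumption on my part, not something Tess4j provides) is to funnel every PDF OCR call in the JVM through a single lock. A minimal sketch, where `SerializedOcr` is a hypothetical wrapper and `delegate` stands in for whatever call actually performs the OCR, such as a lambda wrapping your Tess4j call:

```java
import java.util.concurrent.locks.ReentrantLock;
import java.util.function.Function;

final class SerializedOcr {
    // One lock shared by all instances, on the assumption that the underlying
    // converter tolerates only one active conversion per JVM.
    private static final ReentrantLock GLOBAL_LOCK = new ReentrantLock();

    // Hypothetical stand-in for the real OCR call: PDF path -> recognized text.
    private final Function<String, String> delegate;

    SerializedOcr(Function<String, String> delegate) {
        this.delegate = delegate;
    }

    // Blocks until no other thread is converting, then runs the delegate.
    String recognize(String pdfPath) {
        GLOBAL_LOCK.lock();
        try {
            return delegate.apply(pdfPath);
        } finally {
            GLOBAL_LOCK.unlock();
        }
    }
}
```

This trades throughput for safety: PDF conversions run one at a time while the rest of the application stays multithreaded. It only helps within a single JVM, and it does not apply if you OCR plain images, which do not go through Ghostscript.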
