Tess4J with Spring MVC

无人及你 2021-01-23 14:58

I tried Tess4J in a standalone Java program, and it worked properly, producing the expected text output.

Now I am trying to create a Spring MVC web project, adding the dependencie

2 Answers
  • 2021-01-23 15:09

    I faced a similar problem using Tess4J in a Dynamic Web Project, and a comment by @nguyenq helped me get it working. Tess4J mostly relies on a TIFF handler for optical character recognition, and the plugins it needs are not available in the default ImageIO, so jai_imageio.jar is required. All I did was add the line ImageIO.scanForPlugins() before calling the wrapper class that performs doOCR. I had the following JARs in my lib folder:

    tess4j.jar

    jai_imageio.jar

    ghost4j-0.3.1.jar

    jna.jar

    junit-4.10.jar

    Here's the sample code:

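    // Register ImageIO plugins found on the classpath (e.g. the TIFF reader from jai_imageio.jar)
    ImageIO.scanForPlugins();
    // extractTextFromImage is static, so it is called on the class rather than on an instance
    String extractedString = TessractOCR.extractTextFromImage(binarizrImage);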
    

    The function

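    import java.awt.image.BufferedImage;
    import java.io.File;
    import javax.imageio.ImageIO;
    import net.sourceforge.tess4j.Tesseract;

    public static String extractTextFromImage(BufferedImage image) {
        String result = null;
        try {
            // Write the image to disk so Tesseract can read it as a file
            File outputfile = new File("saved.png");
            ImageIO.write(image, "png", outputfile);

            Tesseract instance = Tesseract.getInstance(); // JNA Interface Mapping
            // Folder that contains the tessdata directory
            instance.setDatapath("E:\\OCR-data\\Tess4J-1.2-src\\Tess4J");

            result = instance.doOCR(outputfile);
            System.out.println(result);
        } catch (Exception e) {
            System.err.println(e.getMessage());
        }
        // null if writing the image or running OCR failed
        return result;
    }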
    

    It works 100% :)
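
    If it still fails in your web app, it may help to check whether ImageIO can see a TIFF reader at all after the scan. Below is a minimal sketch using only the standard javax.imageio API; the class name ImageIoPluginCheck and the helper canRead are made up for illustration and are not part of Tess4J.

    ```java
    import java.util.Arrays;
    import javax.imageio.ImageIO;

    public class ImageIoPluginCheck {

        // Hypothetical helper: true if at least one ImageIO reader is registered for the format name
        static boolean canRead(String formatName) {
            return ImageIO.getImageReadersByFormatName(formatName).hasNext();
        }

        public static void main(String[] args) {
            // Re-scan the classpath so plugins from JARs such as jai_imageio.jar get registered
            ImageIO.scanForPlugins();

            // List every reader format ImageIO currently knows about
            System.out.println("Reader formats: " + Arrays.toString(ImageIO.getReaderFormatNames()));

            // Whether this prints true depends on the JDK version and on the JARs on the classpath
            System.out.println("TIFF reader available: " + canRead("tiff"));
        }
    }
    ```

    If the TIFF line prints false inside the web application but true in a standalone run, the plugin JAR is probably not visible to the web app's class loader.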

  • 2021-01-23 15:10

    Here is the working code, shared for everyone:

    public static String doOCR(File pdfInvoice) {
        long startTime = System.currentTimeMillis();
        // Hardcoded sample input; the pdfInvoice parameter is not used here
        File imageFile = new File("D:\\docfolder\\9011121584.pdf");
        Tesseract instance = Tesseract.getInstance(); // JNA Interface Mapping

        try {
            // Register ImageIO plugins (such as the TIFF reader) before running OCR
            ImageIO.scanForPlugins();
            String result = instance.doOCR(imageFile);

            long totalTime = System.currentTimeMillis() - startTime;
            System.out.println("Total Time Taken For OCR (seconds): " + (totalTime / 1000));
            return result;
        } catch (Exception e) {
            System.err.println(e.getMessage());
            return ""; // empty string on failure
        }
    }
    