Tess4J with Spring MVC

无人及你

I have tried Tess4J as a standalone Java program, and it worked properly, giving the text output.

Now I am trying to create a Spring MVC web project, adding the dependencie

2 Answers

后悔当初

2021-01-23 15:09

I ran into a similar problem using Tess4J in a Dynamic Web Project, and a comment by @nguyenq helped me get it working. Tess4J mostly relies on a TIFF handler for optical recognition, and the default ImageIO does not provide the dependencies it needs, so jai_imageio.jar is required. All I did was add a call to ImageIO.scanForPlugins() before calling the wrapper class that performs doOCR. I had the following jars in my lib:

tess4j.jar

jai_imageio.jar

ghost4j-0.3.1.jar

jna.jar

junit-4.10.jar

Here's the sample code:

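// Register ImageIO plugins found on the classpath before running OCR.
ImageIO.scanForPlugins();
// extractTextFromImage is static, so it is called on the class itself.
String extractedString = TessractOCR.extractTextFromImage(binarizrImage);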
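To see whether the scan actually made a TIFF reader available inside the web application, one option is to ask ImageIO which readers it has registered. The sketch below uses only the JDK; the class and method names (`TiffReaderCheck`, `hasReader`) are made up for illustration. It assumes the plugin jar sits on the web application's classpath (for example in WEB-INF/lib). On Java 9 and later the JDK ships its own TIFF plugin, so the check may pass even without jai_imageio.jar.

```java
import java.util.Arrays;
import javax.imageio.ImageIO;

/** Hypothetical helper: reports whether ImageIO currently has a reader for a format. */
final class TiffReaderCheck {

    /** Returns true if at least one registered ImageIO reader handles the format name. */
    static boolean hasReader(String formatName) {
        return ImageIO.getImageReadersByFormatName(formatName).hasNext();
    }

    public static void main(String[] args) {
        // Rescan the classpath so that plugin jars visible to the current
        // class loader are registered with ImageIO.
        ImageIO.scanForPlugins();

        System.out.println("TIFF reader available: " + hasReader("tiff"));
        System.out.println("Reader formats: "
                + Arrays.toString(ImageIO.getReaderFormatNames()));
    }
}
```

Calling `hasReader("tiff")` once at startup, or just before doOCR, and logging the result should make a missing plugin visible early instead of surfacing as an OCR failure.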

The function:

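import java.awt.image.BufferedImage;
import java.io.File;
import javax.imageio.ImageIO;
import net.sourceforge.tess4j.Tesseract;

public static String extractTextFromImage(BufferedImage image) {
    String result = null;
    try {
        // Write the image to disk so it can be passed to doOCR as a file.
        File outputfile = new File("saved.png");
        ImageIO.write(image, "png", outputfile);

        Tesseract instance = Tesseract.getInstance(); // JNA Interface Mapping
        // Location of the Tesseract data on this machine.
        instance.setDatapath("E:\\OCR-data\\Tess4J-1.2-src\\Tess4J");

        result = instance.doOCR(outputfile);
        System.out.println(result);
    } catch (Exception e) {
        System.err.println(e.getMessage());
    }
    // null means OCR failed; the error message was printed above.
    return result;
}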

It works 100% :)
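One thing to watch in a web application: a relative path such as "saved.png" resolves against the server's working directory, and concurrent requests would overwrite the same file. A possible variation, sketched here with only the JDK, writes each image to its own temporary file and deletes it afterwards. The names `TempImageOcr` and `ocrViaTempFile` are hypothetical, and the OCR step is passed in as a plain `Function<File, String>` so that the sketch stays self-contained.

```java
import java.awt.image.BufferedImage;
import java.io.File;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.util.function.Function;
import javax.imageio.ImageIO;

/** Hypothetical helper: hands a BufferedImage to a file-based OCR call via a temp PNG. */
final class TempImageOcr {

    static String ocrViaTempFile(BufferedImage image, Function<File, String> ocr) {
        File tmp = null;
        try {
            // A unique file per call avoids clashes between concurrent requests.
            tmp = File.createTempFile("ocr-", ".png");
            if (!ImageIO.write(image, "png", tmp)) {
                throw new IOException("No PNG writer available");
            }
            return ocr.apply(tmp);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } finally {
            // Clean up even if the OCR step throws.
            if (tmp != null) {
                try {
                    Files.deleteIfExists(tmp.toPath());
                } catch (IOException ignored) {
                    // Best effort: a leftover temp file is not fatal.
                }
            }
        }
    }

    public static void main(String[] args) {
        BufferedImage blank = new BufferedImage(10, 10, BufferedImage.TYPE_INT_RGB);
        // Stand-in for the real OCR call.
        System.out.println(ocrViaTempFile(blank, f -> "wrote " + f.length() + " bytes"));
    }
}
```

With Tess4J, the function argument would wrap instance.doOCR(file); since doOCR declares a checked exception, the lambda would need to catch it and rethrow or return a fallback value.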

我寻月下人不归

2021-01-23 15:10

Here is the working code, shared for everyone:

public static String doOCR(File pdfInvoice) {
    long startTime = System.currentTimeMillis();
    Tesseract instance = Tesseract.getInstance();

    try {
        // Register ImageIO plugins found on the classpath before reading the file.
        ImageIO.scanForPlugins();
        // pdfInvoice is the file to read, e.g. new File("D:\\docfolder\\9011121584.pdf").
        String result = instance.doOCR(pdfInvoice);

        long totalTime = System.currentTimeMillis() - startTime;
        // Elapsed time in whole seconds.
        System.out.println("Total Time Taken For OCR: " + (totalTime / 1000));
        return result;
    } catch (Exception e) {
        System.err.println(e.getMessage());
        // An empty string signals that OCR failed.
        return "";
    }
}
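The timing above divides by 1000 with integer arithmetic, so any run shorter than a second prints 0. If finer resolution is useful, the measurement could be pulled into a small wrapper like the one below. This is only a sketch: `Timed` and `timed` are hypothetical names, and the task is an arbitrary `Supplier` rather than a real Tess4J call.

```java
import java.util.function.Supplier;

/** Hypothetical helper: runs a task and prints how long it took in milliseconds. */
final class Timed {

    static <T> T timed(String label, Supplier<T> task) {
        // nanoTime is monotonic, which makes it suitable for measuring elapsed time.
        long start = System.nanoTime();
        try {
            return task.get();
        } finally {
            long elapsedMs = (System.nanoTime() - start) / 1_000_000;
            System.out.println(label + " took " + elapsedMs + " ms");
        }
    }

    public static void main(String[] args) {
        // Stand-in task; an OCR call would go inside the lambda instead.
        String value = timed("demo task", () -> "done");
        System.out.println(value);
    }
}
```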
