The Tesseract OCR engine can't read text from an auto-generated image, but can from a section cut out in MS Paint


I'm using a .NET wrapper for the Tesseract OCR engine. I have a large document that is a PNG. When I cut out a section of the image in MS Paint and then feed it into the engine, it


1 Answer

旧时难觅i

2021-01-21 14:21

The default resolution of a new Bitmap is 96 DPI, which is not adequate for OCR. Try increasing it to 300 DPI, for example:

bmp.SetResolution(300, 300);

Update 1: When you rescale the image, its pixel dimensions should change as well, not just the DPI setting. Here's an example rescale function:

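using System.Drawing;
using System.Drawing.Drawing2D;

// Returns a copy of the image resampled to the requested DPI. The caller disposes the result.
public static Image Rescale(Image image, int dpiX, int dpiY)
{
    // Scale the pixel dimensions by the DPI ratio so the physical size stays the same.
    int width = (int)(image.Width * dpiX / image.HorizontalResolution);
    int height = (int)(image.Height * dpiY / image.VerticalResolution);
    Bitmap bm = new Bitmap(width, height);
    bm.SetResolution(dpiX, dpiY);
    using (Graphics g = Graphics.FromImage(bm))
    {
        g.InterpolationMode = InterpolationMode.Bicubic;
        g.PixelOffsetMode = PixelOffsetMode.HighQuality;
        // DrawImage(image, x, y) draws at the image's physical size, so it fills the larger bitmap.
        g.DrawImage(image, 0, 0);
    }

    return bm;
}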
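
To see why Update 1 resizes the bitmap rather than only calling SetResolution, here is a minimal sketch. It assumes System.Drawing is available; the class name ResolutionDemo and the 200x100 size are made up for illustration. It prints the pixel size before and after SetResolution, then the pixel size a 300 DPI copy would need to keep the same physical size.

```csharp
using System;
using System.Drawing;

// Hypothetical demo class, not part of the original answer.
public static class ResolutionDemo
{
    public static void Main()
    {
        // 200x100 is an arbitrary size chosen for illustration.
        using (var bmp = new Bitmap(200, 100))
        {
            // Remember the resolution the bitmap was created with.
            float originalDpiX = bmp.HorizontalResolution;
            float originalDpiY = bmp.VerticalResolution;
            Console.WriteLine($"Before: {bmp.Width}x{bmp.Height} px at {originalDpiX} DPI");

            // SetResolution changes only the DPI metadata; no pixels are added.
            bmp.SetResolution(300, 300);
            Console.WriteLine($"After:  {bmp.Width}x{bmp.Height} px at {bmp.HorizontalResolution} DPI");

            // Pixel size needed at 300 DPI to cover the same physical area,
            // using the same formula as Rescale.
            int targetWidth = (int)(bmp.Width * 300 / originalDpiX);
            int targetHeight = (int)(bmp.Height * 300 / originalDpiY);
            Console.WriteLine($"Needed: {targetWidth}x{targetHeight} px at 300 DPI");
        }
    }
}
```

The pixel dimensions should be the same on the first two lines, which suggests that SetResolution alone gives the OCR engine no extra detail to work with. Resampling to the larger size, as Rescale does, is what actually increases the pixel count per character.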
