How to detect subscript numbers in an image using OCR?


抹茶落季 2021-02-14 11:46

I am using tesseract for OCR, via the pytesseract bindings. Unfortunately, I encounter difficulties when trying to extract text including subscript-sty

3 answers

误落风尘 (original poster)

2021-02-14 12:45
This happens because the subscript characters are too small for Tesseract to recognize reliably. You can upscale the image with a Python package such as cv2 (OpenCV) or PIL and run OCR on the resized image, as in the code below.
```
import pytesseract
import cv2

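# Load the image; cv2.imread returns None if the file cannot be read
img = cv2.imread('test.jpg')
# Upscale by a factor of 2 in both dimensions so the subscript is larger
img = cv2.resize(img, None, fx=2, fy=2)

# Run OCR on the enlarged image
data = pytesseract.image_to_string(img)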
print(data)
```
OUTPUT:
```
CH3
```
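The answer also mentions PIL as an alternative to cv2. A minimal sketch of that variant follows, assuming Pillow and pytesseract are installed; the helper names `scaled_size` and `ocr_upscaled` and the choice of the LANCZOS filter are illustrative assumptions, not part of the original answer.

```python
def scaled_size(width, height, factor=2.0):
    """Return the (width, height) after scaling, never smaller than 1 pixel."""
    return (max(1, round(width * factor)), max(1, round(height * factor)))


def ocr_upscaled(path, factor=2.0):
    """Upscale the image at `path` with PIL, then run Tesseract on it.

    Hypothetical helper: the imports are local so that scaled_size can be
    used without Pillow or pytesseract installed.
    """
    from PIL import Image
    import pytesseract

    with Image.open(path) as img:
        # LANCZOS is an assumed resampling choice for enlarging text
        resized = img.resize(scaled_size(*img.size, factor), Image.LANCZOS)
        return pytesseract.image_to_string(resized)
```

Calling `ocr_upscaled('test.jpg')` would mirror the cv2 example above. Whether a factor of 2 is enough depends on the image, so it may be worth trying a few values.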