Question
I am using a text detector called CRAFT (you can find it on GitHub). It does a good job on most of the images I have tried, but I have noticed that the detection is very sensitive to lighting conditions.
To illustrate this, see this image:
Text detected with CRAFT
I am interested in detecting the code part, which is FBIU0301487. However, the character 'F' is not detected even with a threshold of zero, i.e. when every bounding box is considered a valid detection. After seeing more cases like this, I think it happens because of the lighting conditions, i.e. in areas where there is high contrast between the parts in shadow and the parts that reflect the light. The same thing happens in the '22G1' text box with its first character, '2'.
My question is: is there anything I can do to detect these parts? Maybe some image preprocessing would help, but I am not sure what kind. I am working in Python, and I don't mind which library I use (OpenCV, scikit-image, ...).
I'd appreciate any ideas. Thanks in advance!
Answer 1:
One possibility would be to do division normalization in Python/OpenCV.
Input:
import cv2
import numpy as np
# read the image
img = cv2.imread('truck_id.jpg')
# convert to gray
gray = cv2.cvtColor(img,cv2.COLOR_BGR2GRAY)
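# blur heavily to estimate the slowly varying illumination
smooth = cv2.GaussianBlur(gray, (33,33), 0)
# divide smooth by gray image (results above 255 saturate to 255)
division = cv2.divide(smooth, gray, scale=255)
# invert, so that pixels brighter than their surroundings come out bright
division = 255 - division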
# save results
cv2.imwrite('truck_id_division.jpg',division)
# show results
cv2.imshow('smooth', smooth)
cv2.imshow('division', division)
cv2.waitKey(0)
cv2.destroyAllWindows()
Division Result:
Division Result (full resolution subsection):
Answer 2:
You can apply a local adaptive threshold to the image before running the text detection algorithm.
import cv2
from matplotlib import pyplot as plt

# read the image as grayscale and reduce noise before thresholding
img = cv2.imread('dave.jpg', cv2.IMREAD_GRAYSCALE)
img = cv2.medianBlur(img, 5)

# global threshold, for comparison
ret, th1 = cv2.threshold(img, 127, 255, cv2.THRESH_BINARY)
# adaptive thresholds: per-pixel threshold from an 11x11 neighborhood, minus 2
th2 = cv2.adaptiveThreshold(img, 255, cv2.ADAPTIVE_THRESH_MEAN_C,
                            cv2.THRESH_BINARY, 11, 2)
th3 = cv2.adaptiveThreshold(img, 255, cv2.ADAPTIVE_THRESH_GAUSSIAN_C,
                            cv2.THRESH_BINARY, 11, 2)

titles = ['Original Image', 'Global Thresholding (v = 127)',
          'Adaptive Mean Thresholding', 'Adaptive Gaussian Thresholding']
images = [img, th1, th2, th3]

# show the four results side by side
for i in range(4):
    plt.subplot(2, 2, i + 1)
    plt.imshow(images[i], 'gray')
    plt.title(titles[i])
    plt.xticks([])
    plt.yticks([])
plt.show()
Source: https://stackoverflow.com/questions/64663469/manage-a-text-detector-which-is-very-sensitive-to-lighting-conditions