Question
I have a set of images that represent letters extracted from an image of a word. In some images there are remains of the adjacent letters and I want to eliminate them but I do not know how.
Some samples
I'm working with OpenCV and have tried two approaches, but neither works.
With findContours:
def is_contour_bad(c):
    return len(c) < 50

gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
edged = cv2.Canny(gray, 50, 100)
contours = cv2.findContours(edged.copy(), cv2.RETR_LIST, cv2.CHAIN_APPROX_SIMPLE)
contours = contours[0] if imutils.is_cv2() else contours[1]
mask = np.ones(image.shape[:2], dtype="uint8") * 255
for c in contours:
    # if the contour is bad, draw it on the mask
    if is_contour_bad(c):
        cv2.drawContours(mask, [c], -1, 0, -1)
# remove the contours from the image and show the resulting image
image = cv2.bitwise_and(image, image, mask=mask)
cv2.imshow("After", image)
cv2.waitKey(0)
I think it does not work because the remains lie on the edge of the image, so cv2.drawContours cannot compute the area correctly and does not remove the interior points.
With connectedComponentsWithStats:
cv2.imshow("Image", img)
cv2.waitKey(0)
nb_components, output, stats, centroids = cv2.connectedComponentsWithStats(img)
sizes = stats[1:, -1]
nb_components = nb_components - 1
min_size = 150
img2 = np.zeros((output.shape))
for i in range(0, nb_components):
    if sizes[i] >= min_size:
        img2[output == i + 1] = 255
cv2.imshow("After", img2)
cv2.waitKey(0)
In this case, I do not know why the small elements on the sides are not recognized as connected components.
I would greatly appreciate any help!
Answer 1:
At the very beginning of the question you mention that the letters were extracted from an image of a word.
I think that if the extraction had been done correctly, you would not have run into this problem. Here is a solution that applies both to extracting letters from the original image and to separating the letters in the images you have given.
Solution:
You can use convex hull coordinates to separate the characters, like this.
Code:
import cv2
import numpy as np
img = cv2.imread('test.png', 0)
cv2.bitwise_not(img,img)
img2 = img.copy()
ret, threshed_img = cv2.threshold(img, 127, 255, cv2.THRESH_BINARY)
# [-2:] keeps (contours, hierarchy) on both OpenCV 3.x (three return values) and 4.x (two)
contours, hier = cv2.findContours(threshed_img, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_NONE)[-2:]
#--- Black image to be used to draw individual convex hull ---
black = np.zeros_like(img)
contours = sorted(contours, key=lambda ctr: cv2.boundingRect(ctr)[0])
for cnt in contours:
    hull = cv2.convexHull(cnt)
    black2 = black.copy()
    #--- Here is where I am filling the contour after finding the convex hull ---
    cv2.drawContours(black2, [hull], -1, (255, 255, 255), -1)
    r, t2 = cv2.threshold(black2, 127, 255, cv2.THRESH_BINARY)
    masked = cv2.bitwise_and(img2, img2, mask = t2)
    cv2.imshow("masked.jpg", masked)
    cv2.waitKey(0)
cv2.destroyAllWindows()
outputs:
So my suggestion is to apply this approach when you extract the characters from the original image, rather than removing noise after extraction.
Answer 2:
I would try the following:
1. Sum along the columns so that every image is projected onto a vector.
2. Assuming that white = 0 and black = 1, find the first index in that vector whose value is 0.
3. Remove the image columns to the left of the index from step 2.
4. Reverse the summed vector from step 1.
5. Find the first index whose value is 0 in the reversed vector from step 4.
6. Remove the image columns to the right of the index from step 5, mapped back to the original (unreversed) column order.
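The steps above can be sketched with NumPy roughly as follows. The function name trim_side_remains and the sample array are made up for illustration, and the sketch assumes the image is already a 2-D array with white = 0 and black = 1.

```python
import numpy as np

def trim_side_remains(ink):
    """Crop everything left of the first blank column and right of the last one.

    `ink` is a 2-D array with white = 0 and black = 1 (assumed convention).
    """
    profile = ink.sum(axis=0)                 # step 1: one sum per column
    blank = np.flatnonzero(profile == 0)      # indices of columns with no ink
    if blank.size == 0:
        return ink                            # no blank column, nothing to trim
    left = blank[0]                           # step 2: first zero from the left
    rev = np.flatnonzero(profile[::-1] == 0)  # steps 4-5: first zero in reversed vector
    right = profile.size - 1 - rev[0]         # map it back to the original order
    return ink[:, left:right + 1]             # steps 3 and 6: drop the outer columns

# Remains in columns 0 and 6-7, the letter in columns 2-4
sample = np.array([
    [1, 0, 0, 1, 1, 0, 0, 1],
    [1, 0, 1, 0, 1, 0, 0, 1],
    [0, 0, 1, 1, 1, 0, 1, 1],
])
cropped = trim_side_remains(sample)
```

This only behaves sensibly when blank columns separate the letter from the remains. If the letter itself touches the image border, the same steps would cut into the letter.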
This works nicely for a binary image where white = 0 and black = 1. If your images are not in that form, there are several ways around it, including image thresholding or setting a tolerance level (e.g. in step 2, find the first index in the vector whose value is > tolerance...).
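As a rough illustration of both workarounds, the sketch below thresholds a grayscale image into the white = 0, black = 1 convention and then looks for the first nearly blank column. The names to_ink and first_gap, the threshold of 128, and the sample values are all assumptions for the example. In this convention a gap is a column whose sum is at or below the tolerance; with the opposite polarity (white as the high value) the comparison flips.

```python
import numpy as np

def to_ink(gray, threshold=128):
    """Map a grayscale image (0 = black, 255 = white) to white = 0, black = 1."""
    return (np.asarray(gray) < threshold).astype(np.uint8)

def first_gap(profile, tolerance=0):
    """Index of the first column whose ink count is <= tolerance, or None."""
    hits = np.flatnonzero(np.asarray(profile) <= tolerance)
    return int(hits[0]) if hits.size else None

# Dark pixels are ink; column 1 holds a single stray dark pixel
gray = np.array([
    [0,   255, 0, 0, 255, 255],
    [0,   255, 0, 0, 255, 255],
    [255, 0,   0, 0, 255, 255],
], dtype=np.uint8)
ink = to_ink(gray)
profile = ink.sum(axis=0)
strict_gap = first_gap(profile)                # only fully blank columns count
loose_gap = first_gap(profile, tolerance=1)    # a one-pixel column still counts as a gap
```

A higher tolerance makes the gap search less sensitive to stray pixels, at the cost of possibly treating thin strokes as gaps.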
Source: https://stackoverflow.com/questions/53504738/remove-remains-in-a-letter-image-with-python