Question
My goal is to remove all non-text areas from a scanned business card image, but I don't know the right steps to do that with OpenCV. I followed the steps below, but I'm not sure they are correct, and I don't see any change in the image (the non-text areas remain). Any ideas would be very helpful, thanks.
1) Convert the image to grayscale
2) Binarize the image
3) Invert the colors (cv::bitwise_not) so that the text pixels are white
4) Erode the image (cv::erode)
5) Detect edges using Canny
6) Detect text lines using the Hough transform (not done yet)
Code:
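// Grayscale cv::Mat from the image currently shown in the view.
cv::Mat greyMat = [self.imageView.image CVGrayscaleMat];
cv::Mat bwMat;
// Binarize, then invert so the text becomes white on black.
cv::threshold(greyMat, bwMat, 128, 255, cv::THRESH_BINARY);
cv::bitwise_not(bwMat, bwMat);
// Erode with a 2x1 rectangular kernel.
cv::Mat element = cv::getStructuringElement(cv::MORPH_RECT, cv::Size(2, 1));
cv::erode(bwMat, bwMat, element);
// Collect every non-zero (white) pixel and fit a rotated bounding box around them.
std::vector<cv::Point> points;
cv::Mat_<uchar>::iterator it = bwMat.begin<uchar>();
cv::Mat_<uchar>::iterator end = bwMat.end<uchar>();
for (; it != end; ++it) {
    if (*it) {
        points.push_back(it.pos());
    }
}
cv::RotatedRect box = cv::minAreaRect(cv::Mat(points));
// Smooth, then detect edges (C++ API equivalents of cvSmooth and cvCanny).
cv::Mat smoothedMat, edgesMat;
cv::GaussianBlur(bwMat, smoothedMat, cv::Size(3, 3), 0);
cv::Canny(smoothedMat, edgesMat, 10, 100);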
Answer 1:
One way to define the logo is that its parts contain thicker lines and larger blobs than the text parts.
If this holds true, you can remove the logo like this:
(I used ImageMagick to generate the following samples; you should be able to do the same with OpenCV.)
0) Threshold the image (optional)
ImageMagick: convert img0052ir.jpg -threshold 60% monochrome.png
1) Dilate the image until all letters are gone but some parts of the logo still remain.
orgImg = business card image
cvDilate (orgImg, curImg, NULL, Some_Value) // find a value that erases all letters but leaves part of the logo
or
ImageMagick: convert monochrome.png -morphology dilate:3 diamond dilated.png
2) Conditionally erode the dilated image, using the original image as the mask, until the logo is complete again.
This means you erode the dilated image but never set a pixel to a value below its value in the original source image. The original image acts as a mask that protects parts of the image from changes.
This restores every shape that still has a seed left after the dilation, which should be only the logo.
// I'm not sure whether OpenCV can use a mask in morphological operations,
// but you can use erode + max in a loop to achieve the same effect, albeit more slowly.
repeat
    lastImg = copy of curImg
    cvErode (lastImg, curImg, NULL, 1)
    cvMax (curImg, orgImg, curImg)
until lastImg == curImg
or
ImageMagick: convert dilated.png -morphology erode:20 diamond -clip-mask monochrome.png eroded.png
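Steps 1 and 2 can be sketched in plain C++ on a tiny in-memory image, just to show the mechanics. This is only an illustration under some assumptions: the `Gray` struct, `morph3x3`, and `extractLogo` are hypothetical names made up for this example, the kernel is a fixed 3x3 square, and dark ink is 0 on white paper at 255. With OpenCV you would presumably use cv::dilate, cv::erode and cv::max on cv::Mat instead.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical minimal grayscale image: 0 = black ink, 255 = white paper.
struct Gray {
    int w = 0, h = 0;
    std::vector<std::uint8_t> px;  // row-major, w * h values
    Gray(int w_, int h_, std::uint8_t fill) : w(w_), h(h_), px(w_ * h_, fill) {}
    std::uint8_t& at(int x, int y) { return px[y * w + x]; }
    std::uint8_t at(int x, int y) const { return px[y * w + x]; }
};

// One 3x3 morphology pass: max (dilate) or min (erode) over each pixel's
// neighbourhood. Neighbours outside the image are ignored.
Gray morph3x3(const Gray& src, bool dilate) {
    Gray dst(src.w, src.h, 0);
    for (int y = 0; y < src.h; ++y) {
        for (int x = 0; x < src.w; ++x) {
            std::uint8_t v = src.at(x, y);
            for (int dy = -1; dy <= 1; ++dy) {
                for (int dx = -1; dx <= 1; ++dx) {
                    int nx = x + dx, ny = y + dy;
                    if (nx < 0 || ny < 0 || nx >= src.w || ny >= src.h) continue;
                    std::uint8_t n = src.at(nx, ny);
                    v = dilate ? std::max(v, n) : std::min(v, n);
                }
            }
            dst.at(x, y) = v;
        }
    }
    return dst;
}

// Step 1: dilate `iterations` times so thin strokes disappear.
// Step 2: erode repeatedly, clamping each result so no pixel drops below
// its value in the original, until nothing changes any more.
Gray extractLogo(const Gray& org, int iterations) {
    assert(iterations >= 0);
    Gray cur = org;
    for (int i = 0; i < iterations; ++i) cur = morph3x3(cur, true);
    for (;;) {
        Gray next = morph3x3(cur, false);
        for (std::size_t i = 0; i < next.px.size(); ++i)
            next.px[i] = std::max(next.px[i], org.px[i]);  // original acts as the mask
        if (next.px == cur.px) return cur;
        cur = next;
    }
}
```

The loop stops because each pass can only darken pixels and never goes below the original, so the image eventually stops changing. As in the pseudocode, the number of dilation passes is the value you have to tune per image.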
You now have an image with ONLY the logo and no text. Use this image to remove the logo:
ImageMagick: convert eroded.png -negate img0052ir.jpg -compose plus -composite test.png
As you can see, it is far from perfect. Try experimenting with the thresholding, the dilation kernel, and so on in OpenCV to see whether this can be improved. But I doubt there is a general solution that removes all logos and none of the text.
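The negate-and-add step can be sketched in plain C++ as well. This is a minimal illustration, assuming 8-bit grayscale pixels stored in flat arrays with dark ink at 0 and white paper at 255; `removeLogo` is a hypothetical helper name, not an OpenCV or ImageMagick function. In OpenCV the rough equivalent would be cv::bitwise_not on the logo-only image followed by cv::add, which saturates for 8-bit images.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Negate the logo-only image and add it to the card, saturating at 255.
// Logo pixels (0 in logoOnly) become 255 after negation, so they push the
// card to white; everywhere else the negated value is 0 and the card pixel
// is left unchanged.
std::vector<std::uint8_t> removeLogo(const std::vector<std::uint8_t>& card,
                                     const std::vector<std::uint8_t>& logoOnly) {
    assert(card.size() == logoOnly.size());
    std::vector<std::uint8_t> out(card.size());
    for (std::size_t i = 0; i < card.size(); ++i) {
        int negated = 255 - logoOnly[i];
        out[i] = static_cast<std::uint8_t>(std::min(255, card[i] + negated));
    }
    return out;
}
```

Because the sum is clamped, a gray edge pixel in the logo-only image only lightens the card partially, which is one reason faint logo outlines can survive this step.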
EDIT: some pictures added
Source: https://stackoverflow.com/questions/10255013/how-to-use-opencv-to-remove-non-text-areas-from-a-business-card