I don't know whether I should post this question here or not, but if someone knows the answer, please share it.
What algorithms are there for determining which regions of an image are text and which are graphics (a figure or diagram)? In other words, how can such regions be separated?
Most OCR software, e.g., OCRopus, supports layout analysis, which is what you need.
Mao, Rosenfeld & Kanungo (2003), "Document structure analysis algorithms: a literature survey", provides a fairly recent survey of layout analysis algorithms.
A first step would probably be to isolate the sharper contrast between text and the rest of the image. This can be done by taking the derivative (gradient) of the image, which shows where the color changes abruptly. Regions with high derivative values are the most likely text candidates and can then be compared against textual shapes.
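To make the derivative idea a bit more concrete, here is a minimal pure-Python sketch. It is only one possible reading of the suggestion: it approximates the derivative with forward differences, then measures how many strong-edge pixels fall in each tile. The function names, the 8-pixel block size, and both thresholds are assumptions chosen for illustration, not values from any library or paper. Textured photographs also produce high gradients, so this should be treated as a first-pass filter for candidates rather than a real text detector.

```python
from typing import List

Image = List[List[int]]  # grayscale rows, pixel values 0..255


def gradient_magnitude(img: Image) -> List[List[int]]:
    """Approximate |dI/dx| + |dI/dy| with forward differences."""
    h, w = len(img), len(img[0])
    grad = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            # Pixels on the right/bottom border have no forward neighbour.
            dx = img[y][x + 1] - img[y][x] if x + 1 < w else 0
            dy = img[y + 1][x] - img[y][x] if y + 1 < h else 0
            grad[y][x] = abs(dx) + abs(dy)
    return grad


def edge_density_by_block(img: Image, block: int = 8,
                          edge_threshold: int = 64) -> List[List[float]]:
    """Fraction of strong-edge pixels inside each block x block tile."""
    grad = gradient_magnitude(img)
    h, w = len(img), len(img[0])
    densities = []
    for by in range(0, h, block):
        row = []
        for bx in range(0, w, block):
            ys = range(by, min(by + block, h))
            xs = range(bx, min(bx + block, w))
            strong = sum(1 for y in ys for x in xs
                         if grad[y][x] >= edge_threshold)
            row.append(strong / (len(ys) * len(xs)))
        densities.append(row)
    return densities


def candidate_text_blocks(img: Image, block: int = 8,
                          edge_threshold: int = 64,
                          density_threshold: float = 0.3) -> List[List[bool]]:
    """Mark tiles whose edge density is high enough to be text candidates."""
    densities = edge_density_by_block(img, block, edge_threshold)
    return [[d >= density_threshold for d in row] for row in densities]
```

For a real document you would load the page as a grayscale array first, tune the thresholds on sample pages, and then pass the candidate tiles to a proper layout-analysis or shape-matching step.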
Source: https://stackoverflow.com/questions/2903176/determining-which-are-the-text-and-graphic-regions-in-an-image