Question
I'm looking for a good panorama stitching library for text. I tried OpenCV and OpenPano. They both work well on regular photos, but fail on text. For example, I need to stitch the following 3 images:
The images have about 45% overlap with each other.
If there's an option to make one of the mentioned libraries work well on text images, instead of finding another library, that would be great.
- I need the library to work on Linux ARM.
Answer 1:
OpenPano fails at stitching text because it cannot retrieve enough feature points (or keypoints) to do the stitching process.
Text stitching doesn't need a matching method that is robust to rotation, only one that is robust to translation. OpenCV conveniently offers such a function: Template Matching.
The solution I will develop is based on this OpenCV feature.
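To give an idea of how little code the core matching takes, here is a minimal, self-contained sketch on synthetic data (the array contents are made up for illustration; the sample images are not used):

import cv2
import numpy as np

# a black 200x200 image with a white square, and a template cut out of it
img = np.zeros((200, 200), np.uint8)
img[60:100, 80:120] = 255
template = img[50:110, 70:130]

res = cv2.matchTemplate(img, template, cv2.TM_CCOEFF) # similarity map
_, _, _, max_loc = cv2.minMaxLoc(res) # best match is at the maximum for TM_CCOEFF
print(max_loc) # (70, 50): exactly where the template was cut from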
Pipeline
I will now explain the main steps of my solution (for further details, please have a look at the code provided below).
Matching process
In order to match two consecutive images (done in the matchImages function, see code below):
- We create a template image by taking 45% (H_templ_ratio) of the first image, as depicted below. This step is done in my code by the genTemplate function.
- We add black margins to the second image (where we want to find the template). This step is necessary if the text is not aligned in the input images (which is the case in these sample images). Here is what the image looks like after the margin process. As you can see, the margins are only needed below and above the image: the template image could theoretically be found anywhere in this margined image. This process is done in the addBlackMargins function (a built-in alternative is shown after this list).
- We apply a Canny filter on both the template image and the image where we want to find it (done inside the mat2Edges function). This will add robustness to the matching process. Here is an example:
- We match the template with the image using matchTemplate and we retrieve the best match location with the minMaxLoc function.
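As an aside, the black-margin step above could also be done with OpenCV's built-in cv2.copyMakeBorder instead of a hand-rolled function; a minimal sketch with made-up sizes:

import cv2
import numpy as np

img = np.full((80, 120, 3), 255, np.uint8) # hypothetical white 120x80 image
top = bottom = 40; left = right = 0
# same effect as the addBlackMargins function in the code below
margined = cv2.copyMakeBorder(img, top, bottom, left, right,
                              cv2.BORDER_CONSTANT, value=(0, 0, 0))
print(margined.shape) # (160, 120, 3)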
Calculating final image size
This step consists of calculating the size of the final matrix, where we will stitch all the images together. It is particularly needed if the input images don't all have the same height.
This step is done inside the calcFinalImgSize function. I won't go into too much detail here because even though it looks a bit complex (to me at least), it is only simple maths (additions, subtractions, multiplications). Take a pen and paper if you want to understand the formulas.
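For intuition, here is a tiny worked example with made-up numbers (two images, one match location), mirroring what calcFinalImgSize computes:

# img1 is 100x500 (h x w), img2 is 120x500, and the template from img1
# was found at loc = (50, 30) in img2, with H_templ_ratio = 0.45
h1, w1 = 100, 500
h2, w2 = 120, 500
loc_x, loc_y = 50, 30

margin_top = loc_y # 30 px needed above img1
margin_bottom = (h2 - loc_y) - h1 # -10: negative, so no margin needed below
h_final = h1 + margin_top + max(margin_bottom, 0) # 100 + 30 + 0 = 130

x_templ = int(float(w1) * 0.45) # 225: the template's width
w_final = w1 + (w2 - x_templ - loc_x) # 500 + (500 - 225 - 50) = 725
print(h_final, w_final) # 130 725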
Stitching process
Once we have the match locations for each input image, we only have to do simple maths to copy the input images into the right spot of the final image. Again, I recommend checking the code for implementation details (see the stitchImages function); the core paste operation is sketched below.
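The "simple maths" boils down to a numpy ROI copy; a minimal sketch using the sizes from the worked example above:

import numpy as np

canvas = np.zeros((130, 725, 3), np.uint8) # final mat, sizes from the example above
img2 = np.full((120, 500, 3), 255, np.uint8) # hypothetical second image
y1 = 30 - 30 # origin_y - loc_y = 0
x1 = 275 - 50 # template x-coordinate (w1*(1-0.45)) - loc_x = 225
canvas[y1:y1+img2.shape[0], x1:x1+img2.shape[1]] = img2 # the actual "stitch" is a ROI copy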
Results
Here is the result with your input images:
As you can see, the result is not "pixel perfect" but it should be good enough for OCR.
And here is another result with input images of different heights:
Code (Python)
My program is written in Python and uses the cv2 (OpenCV) and numpy modules. However, it can easily be ported to other languages such as C++, Java and C#.
import numpy as np
import cv2

def genTemplate(img):
    global H_templ_ratio
    # we get the image's width and height
    h, w = img.shape[:2]
    # we compute the template's bounds: the rightmost H_templ_ratio portion of the image
    x1 = int(float(w)*(1-H_templ_ratio))
    y1 = 0
    x2 = w
    y2 = h
    return(img[y1:y2,x1:x2]) # and crop the input image

def mat2Edges(img): # applies a Canny filter to get the edges
    edged = cv2.Canny(img, 100, 200)
    return(edged)

def addBlackMargins(img, top, bottom, left, right): # top, bottom, left, right: margin widths in pixels
    h, w = img.shape[:2]
    result = np.zeros((h+top+bottom, w+left+right, 3), np.uint8)
    result[top:top+h,left:left+w] = img
    return(result)

# returns the y_offset of the first image to stitch and the final image size needed
def calcFinalImgSize(imgs, loc):
    global H_templ_ratio
    max_margin_top = 0; max_margin_bottom = 0 # maximum margins that will be needed above and below the first image in order to stitch all the images into one mat
    current_margin_top = 0; current_margin_bottom = 0
    h_init, w_init = imgs[0].shape[:2]
    w_final = w_init
    for i in range(0,len(loc)):
        h, w = imgs[i].shape[:2]
        h2, w2 = imgs[i+1].shape[:2]
        # we compute the max top/bottom margins that will be needed (relative to the first input image) in order to stitch all the images
        current_margin_top += loc[i][1] # here, we assume that the template top-left corner Y-coordinate is 0 (relative to its original image)
        current_margin_bottom += (h2 - loc[i][1]) - h
        if(current_margin_top > max_margin_top): max_margin_top = current_margin_top
        if(current_margin_bottom > max_margin_bottom): max_margin_bottom = current_margin_bottom
        # we compute the width needed for the final result
        x_templ = int(float(w)*H_templ_ratio) # the template's width, i.e. its x-coordinate relative to its original image's right side
        w_final += (w2 - x_templ - loc[i][0]) # width needed to stitch all the images into one mat
    h_final = h_init + max_margin_top + max_margin_bottom
    return (max_margin_top, h_final, w_final)

# matches each input image with its following image (1->2, 2->3)
def matchImages(imgs, templates_loc):
    for i in range(0,len(imgs)-1):
        template = genTemplate(imgs[i])
        template = mat2Edges(template)
        h_templ, w_templ = template.shape[:2]
        # apply template matching
        margin_top = margin_bottom = h_templ; margin_left = margin_right = 0
        img = addBlackMargins(imgs[i+1], margin_top, margin_bottom, margin_left, margin_right) # we need to enlarge the input image before calling matchTemplate (the template needs to be strictly smaller than the input image)
        img = mat2Edges(img)
        res = cv2.matchTemplate(img, template, cv2.TM_CCOEFF) # matching function
        _, _, _, templ_pos = cv2.minMaxLoc(res) # minMaxLoc gets the best match position
        # as we added margins to the input image we need to subtract the margin widths to get the template position relative to the initial input image (without the black margins)
        rectified_templ_pos = (templ_pos[0]-margin_left, templ_pos[1]-margin_top)
        templates_loc.append(rectified_templ_pos)
        print("max_loc", rectified_templ_pos)

def stitchImages(imgs, templates_loc):
    y_offset, h_final, w_final = calcFinalImgSize(imgs, templates_loc) # we calculate the "surface" needed to stitch all the images into one mat (and y_offset, the Y offset of the first image to be stitched)
    result = np.zeros((h_final, w_final, 3), np.uint8)
    # initial stitch
    h_init, w_init = imgs[0].shape[:2]
    result[y_offset:y_offset+h_init, 0:w_init] = imgs[0]
    origin = (y_offset, 0) # top-left corner of the last stitched image (y,x)
    # stitching loop
    for j in range(0,len(templates_loc)):
        h, w = imgs[j].shape[:2]
        h2, w2 = imgs[j+1].shape[:2]
        # we compute the coordinates where to stitch imgs[j+1]
        y1 = origin[0] - templates_loc[j][1]
        y2 = origin[0] - templates_loc[j][1] + h2
        x_templ = int(float(w)*(1-H_templ_ratio)) # x-coordinate of the template relative to its original image's left side
        x1 = origin[1] + x_templ - templates_loc[j][0]
        x2 = origin[1] + x_templ - templates_loc[j][0] + w2
        result[y1:y2, x1:x2] = imgs[j+1] # we copy the input image into the result mat
        origin = (y1,x1) # we update the origin point with the last stitched image
    return(result)
if __name__ == '__main__':
    # input images
    part1 = cv2.imread('part1.jpg')
    part2 = cv2.imread('part2.jpg')
    part3 = cv2.imread('part3.jpg')
    imgs = [part1, part2, part3]
    H_templ_ratio = 0.45 # horizontal ratio of the input that we will keep to create a template
    templates_loc = [] # template locations
    matchImages(imgs, templates_loc)
    result = stitchImages(imgs, templates_loc)
    cv2.imshow("result", result)
    cv2.waitKey(0) # keep the window open until a key is pressed
Answer 2:
OpenCV 3 has a Stitcher class which can perform stitching on text as well as photos.
import cv2

imageFiles = [YOUR IMAGE FILE NAMES]
images = []
for filename in imageFiles:
    img = cv2.imread(filename)
    images.append(img)
stitcher = cv2.createStitcher()
status, result = stitcher.stitch(images)
I got this result using your images.
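Note that stitch() also returns a status code, and the factory function was renamed in OpenCV 4, so a version-tolerant call with an explicit status check may be safer; a minimal sketch (file names are placeholders):

import cv2

# hypothetical file names; substitute your own
images = [cv2.imread(f) for f in ('part1.jpg', 'part2.jpg', 'part3.jpg')]
# OpenCV 3 exposes cv2.createStitcher(); OpenCV 4 renamed it to cv2.Stitcher_create()
stitcher = cv2.createStitcher() if hasattr(cv2, 'createStitcher') else cv2.Stitcher_create()
status, result = stitcher.stitch(images)
if status == 0: # 0 is Stitcher::OK; any other value means stitching failed
    cv2.imwrite('result.jpg', result)
else:
    print('stitching failed, status =', status)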
Source: https://stackoverflow.com/questions/45612933/panorama-stitching-for-text