Question
Given this image:
I'd like to make it rotate and stretch so that it fully fits the bounding box, with no whitespace outside the largest rectangular box. It should also handle worse perspective cases, like those in the links I list later on.
Basically, while it is barely noticeable, the rectangle is rotated slightly, and I'd like to fix that distortion.
However, I get an error when attempting to retrieve the four corner points of the contour. I made sure to use contour approximation to isolate only the relevant-looking contours, and as you can see in the image it was successful, yet I still can't apply a perspective warp to it.
I've already tried the links here:
- How to straighten a rotated rectangle area of an image using opencv in python?
- https://www.pyimagesearch.com/2014/05/05/building-pokedex-python-opencv-perspective-warping-step-5-6/
- https://www.pyimagesearch.com/2014/09/01/build-kick-ass-mobile-document-scanner-just-5-minutes/
I followed them with only minor modifications (like not downscaling the image and then upscaling it) and a different input image.
A reader there ran into a similar error in the comments, but the author just said to use contour approximation. I did that, and I still receive the same error.
I have already retrieved the contour (which, along with its bounding box, is the image illustrated earlier), and used this code to attempt the perspective warp:
def warp_perspective(cnt):
    # reshape cnt to get tl, tr, br, bl points
    pts = cnt.reshape(4, 2)
    rect = np.zeros((4, 2), dtype="float32")

    # the top-left point has the smallest coordinate sum,
    # the bottom-right the largest
    s = pts.sum(axis=1)
    rect[0] = pts[np.argmin(s)]
    rect[2] = pts[np.argmax(s)]

    # the top-right point has the smallest x/y difference,
    # the bottom-left the largest
    diff = np.diff(pts, axis=1)
    rect[1] = pts[np.argmin(diff)]
    rect[3] = pts[np.argmax(diff)]

    # solve for the width of the image
    (tl, tr, br, bl) = rect
    widthA = np.sqrt(((br[0] - bl[0]) ** 2) + ((br[1] - bl[1]) ** 2))
    widthB = np.sqrt(((tr[0] - tl[0]) ** 2) + ((tr[1] - tl[1]) ** 2))

    # solve for the height of the image
    heightA = np.sqrt(((tr[0] - br[0]) ** 2) + ((tr[1] - br[1]) ** 2))
    heightB = np.sqrt(((tl[0] - bl[0]) ** 2) + ((tl[1] - bl[1]) ** 2))

    # get the final dimensions
    maxWidth = max(int(widthA), int(widthB))
    maxHeight = max(int(heightA), int(heightB))

    # construct the dst image
    dst = np.array([
        [0, 0],
        [maxWidth - 1, 0],
        [maxWidth - 1, maxHeight - 1],
        [0, maxHeight - 1]], dtype="float32")

    # calculate the perspective transform matrix and warp the perspective
    # (orig is the full-resolution source image defined outside this function)
    M = cv2.getPerspectiveTransform(rect, dst)
    warp = cv2.warpPerspective(orig, M, (maxWidth, maxHeight))
    cv2.imshow("warped", warp)
    return warp
The function accepts cnt as a single contour.
Upon running it, I hit the error mentioned earlier:
in warp_perspective
pts = cnt.reshape(4, 2)
ValueError: cannot reshape array of size 2090 into shape (4,2)
I do not understand this at all. I have successfully isolated and retrieved the correct contour and bounding box, and the only thing I did differently was to skip the downscaling.
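For reference, OpenCV stores a contour as an (N, 1, 2) array, so a size of 2090 means the contour being reshaped has 1045 points rather than the 4 the reshape expects. A minimal sketch that reproduces the same error with a hypothetical dummy contour of that size:

import numpy as np

# a raw contour from cv2.findContours has shape (N, 1, 2);
# 1045 points * 2 coordinates = 2090 values
cnt = np.zeros((1045, 1, 2), dtype=np.int32)
print(cnt.size)          # 2090
pts = cnt.reshape(4, 2)  # ValueError: cannot reshape array of size 2090 into shape (4,2)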
Answer 1:
Try this approach:
- Convert image to grayscale and blur with bilateral filter
- Otsu's threshold
- Find contours
- Perform contour approximation to find the largest square contour
- Perspective transform and rotate
Result
import cv2
import numpy as np
import imutils

def perspective_transform(image, corners):
    def order_corner_points(corners):
        # Separate corners into individual points
        # Index 0 - top-right
        #       1 - top-left
        #       2 - bottom-left
        #       3 - bottom-right
        corners = [(corner[0][0], corner[0][1]) for corner in corners]
        top_r, top_l, bottom_l, bottom_r = corners[0], corners[1], corners[2], corners[3]
        return (top_l, top_r, bottom_r, bottom_l)

    # Order points in clockwise order
    ordered_corners = order_corner_points(corners)
    top_l, top_r, bottom_r, bottom_l = ordered_corners

    # Determine width of new image which is the max distance between
    # (bottom right and bottom left) or (top right and top left) x-coordinates
    width_A = np.sqrt(((bottom_r[0] - bottom_l[0]) ** 2) + ((bottom_r[1] - bottom_l[1]) ** 2))
    width_B = np.sqrt(((top_r[0] - top_l[0]) ** 2) + ((top_r[1] - top_l[1]) ** 2))
    width = max(int(width_A), int(width_B))

    # Determine height of new image which is the max distance between
    # (top right and bottom right) or (top left and bottom left) y-coordinates
    height_A = np.sqrt(((top_r[0] - bottom_r[0]) ** 2) + ((top_r[1] - bottom_r[1]) ** 2))
    height_B = np.sqrt(((top_l[0] - bottom_l[0]) ** 2) + ((top_l[1] - bottom_l[1]) ** 2))
    height = max(int(height_A), int(height_B))

    # Construct new points to obtain top-down view of image in
    # top_r, top_l, bottom_l, bottom_r order
    dimensions = np.array([[0, 0], [width - 1, 0], [width - 1, height - 1],
                           [0, height - 1]], dtype="float32")

    # Convert to Numpy format
    ordered_corners = np.array(ordered_corners, dtype="float32")

    # Find perspective transform matrix
    matrix = cv2.getPerspectiveTransform(ordered_corners, dimensions)

    # Transform the image
    transformed = cv2.warpPerspective(image, matrix, (width, height))

    # Rotate and return the result
    return imutils.rotate_bound(transformed, angle=-90)

image = cv2.imread('1.png')
original = image.copy()
blur = cv2.bilateralFilter(image, 9, 75, 75)
gray = cv2.cvtColor(blur, cv2.COLOR_BGR2GRAY)
thresh = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)[1]

cnts = cv2.findContours(thresh, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
cnts = cnts[0] if len(cnts) == 2 else cnts[1]

for c in cnts:
    peri = cv2.arcLength(c, True)
    approx = cv2.approxPolyDP(c, 0.015 * peri, True)
    if len(approx) == 4:
        cv2.drawContours(image, [c], 0, (36, 255, 12), 3)
        transformed = perspective_transform(original, approx)

cv2.imshow('thresh', thresh)
cv2.imshow('image', image)
cv2.imshow('transformed', transformed)
cv2.waitKey()
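The key difference from the question's code is what gets passed to the transform: perspective_transform receives approx, the 4-point polygon produced by cv2.approxPolyDP, rather than the raw contour c from cv2.findContours, which still contains every boundary point. That is exactly why cnt.reshape(4, 2) failed with size 2090. Your original warp_perspective should work unchanged once it is fed the approximation instead; a minimal sketch, assuming warp_perspective, orig, and cnts are defined as in the question:

for c in cnts:
    peri = cv2.arcLength(c, True)
    approx = cv2.approxPolyDP(c, 0.015 * peri, True)
    if len(approx) == 4:
        # approx has shape (4, 1, 2), i.e. size 8, so reshape(4, 2) succeeds
        warp = warp_perspective(approx)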
Source: https://stackoverflow.com/questions/58422868/how-to-warp-a-rectangular-object-to-fit-its-larger-bounding-box