Scipy rotate and zoom an image without changing its dimensions

夕颜 2021-02-01 21:44

For my neural network I want to augment my training data by adding small random rotations and zooms to my images. The issue I am having is that scipy is changing the size of my

2 Answers
  • 2021-02-01 22:17

    scipy.ndimage.rotate accepts a reshape= parameter:

    reshape : bool, optional

    If reshape is true, the output shape is adapted so that the input array is contained completely in the output. Default is True.

    So to "clip" the edges you can simply call scipy.ndimage.rotate(img, ..., reshape=False).

    from scipy.ndimage import rotate
    from scipy.datasets import face  # SciPy >= 1.10; older releases: scipy.misc.face
    from matplotlib import pyplot as plt
    
    img = face()
    rot = rotate(img, 30, reshape=False)
    
    fig, ax = plt.subplots(1, 2)
    ax[0].imshow(img)
    ax[1].imshow(rot)
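
    For the augmentation use case in the question, the same call can be wrapped in a small helper that draws a random angle. This is only a sketch: random_rotate, max_angle and the synthetic test image are names and values I made up for illustration, and order=1 / mode="nearest" are just one reasonable choice of interpolation and edge fill.

```python
import numpy as np
from scipy.ndimage import rotate


def random_rotate(img, max_angle=10.0, rng=None):
    """Rotate `img` by a random angle in [-max_angle, max_angle] degrees.

    Hypothetical helper: the name and defaults are illustrative only.
    """
    rng = np.random.default_rng() if rng is None else rng
    angle = rng.uniform(-max_angle, max_angle)
    # reshape=False keeps the output the same shape as the input; corners that
    # rotate out of frame are clipped and uncovered areas are filled per `mode`.
    return rotate(img, angle, reshape=False, order=1, mode="nearest")


# Synthetic RGB image standing in for real training data
img = np.random.default_rng(0).integers(0, 256, size=(48, 64, 3), dtype=np.uint8)
rot = random_rotate(img, max_angle=10.0, rng=np.random.default_rng(1))
print(img.shape, rot.shape)
```

    Passing an explicit rng makes the augmentation reproducible, which helps when debugging a training run.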
    

    Things are more complicated for scipy.ndimage.zoom.

    A naive method would be to zoom the entire input array, then use slice indexing and/or zero-padding to make the output the same size as your input. However, in cases where you're increasing the size of the image it's wasteful to interpolate pixels that are only going to get clipped off at the edges anyway.
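
    For reference, the naive approach could look roughly like the sketch below. naive_zoom is a hypothetical name, and the centre crop / centre pad convention is my assumption about what "same size" should mean here.

```python
import numpy as np
from scipy.ndimage import zoom


def naive_zoom(img, zoom_factor, **kwargs):
    """Zoom the whole array, then centre-crop or zero-pad back to img's shape."""
    h, w = img.shape[:2]
    # Leave any trailing (e.g. colour) dimensions unscaled
    factors = (zoom_factor, zoom_factor) + (1,) * (img.ndim - 2)
    zoomed = zoom(img, factors, **kwargs)
    zh, zw = zoomed.shape[:2]

    # Size of the region shared by the zoomed array and the output
    ch, cw = min(h, zh), min(w, zw)
    # Top-left corner of that region in the zoomed array (non-zero when zooming in)
    src_top, src_left = max((zh - h) // 2, 0), max((zw - w) // 2, 0)
    # Top-left corner of that region in the output (non-zero when zooming out)
    dst_top, dst_left = max((h - zh) // 2, 0), max((w - zw) // 2, 0)

    out = np.zeros_like(img)
    out[dst_top:dst_top + ch, dst_left:dst_left + cw] = \
        zoomed[src_top:src_top + ch, src_left:src_left + cw]
    return out


img = np.random.default_rng(0).random((40, 60, 3))
print(naive_zoom(img, 0.5, order=1).shape, naive_zoom(img, 1.5, order=1).shape)
```

    When zoom_factor > 1 this interpolates the full enlarged array and then throws most of it away, which is the waste described above.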

    Instead you could index only the part of the input that will fall within the bounds of the output array before you apply zoom:

    import numpy as np
    from scipy.ndimage import zoom
    
    
    def clipped_zoom(img, zoom_factor, **kwargs):
    
        h, w = img.shape[:2]
    
        # For multichannel images we don't want to apply the zoom factor to the RGB
        # dimension, so instead we create a tuple of zoom factors, one per array
        # dimension, with 1's for any trailing dimensions after the width and height.
        zoom_tuple = (zoom_factor,) * 2 + (1,) * (img.ndim - 2)
    
        # Zooming out
        if zoom_factor < 1:
    
            # Bounding box of the zoomed-out image within the output array
            zh = int(np.round(h * zoom_factor))
            zw = int(np.round(w * zoom_factor))
            top = (h - zh) // 2
            left = (w - zw) // 2
    
            # Zero-padding
            out = np.zeros_like(img)
            out[top:top+zh, left:left+zw] = zoom(img, zoom_tuple, **kwargs)
    
        # Zooming in
        elif zoom_factor > 1:
    
            # Bounding box of the zoomed-in region within the input array.
            # Round up so that the zoomed crop is never smaller than `img`.
            zh = int(np.ceil(h / zoom_factor))
            zw = int(np.ceil(w / zoom_factor))
            top = (h - zh) // 2
            left = (w - zw) // 2
    
            out = zoom(img[top:top+zh, left:left+zw], zoom_tuple, **kwargs)
    
            # `out` might still be slightly larger than `img` due to rounding, so
            # trim off any extra pixels at the edges
            trim_top = ((out.shape[0] - h) // 2)
            trim_left = ((out.shape[1] - w) // 2)
            out = out[trim_top:trim_top+h, trim_left:trim_left+w]
    
        # If zoom_factor == 1, just return the input array
        else:
            out = img
        return out
    

    For example:

    zm1 = clipped_zoom(img, 0.5)
    zm2 = clipped_zoom(img, 1.5)
    
    fig, ax = plt.subplots(1, 3)
    ax[0].imshow(img)
    ax[1].imshow(zm1)
    ax[2].imshow(zm2)
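
    If you need a random rotation and a random zoom together, one alternative (not what the code above does) is to fold both into a single scipy.ndimage.affine_transform call, which returns an array of the input's shape by default and interpolates only once. A minimal sketch, assuming a channels-last image; rotate_zoom is a hypothetical name, and the sign convention of the angle is not guaranteed to match scipy.ndimage.rotate.

```python
import numpy as np
from scipy.ndimage import affine_transform


def rotate_zoom(img, angle_deg, zoom_factor, **kwargs):
    """Rotate and zoom `img` about its centre, keeping the original shape."""
    theta = np.deg2rad(angle_deg)
    rot = np.array([[np.cos(theta), -np.sin(theta)],
                    [np.sin(theta),  np.cos(theta)]])

    # affine_transform maps each OUTPUT coordinate o to the INPUT coordinate
    # matrix @ o + offset, so zooming in by z means dividing by z here.
    matrix = np.eye(img.ndim)          # identity for any trailing channel axis
    matrix[:2, :2] = rot / zoom_factor

    # Choose the offset so that the image centre maps onto itself
    center = np.zeros(img.ndim)
    center[:2] = (np.array(img.shape[:2]) - 1) / 2.0
    offset = center - matrix @ center

    return affine_transform(img, matrix, offset=offset, **kwargs)


rng = np.random.default_rng(0)
img = rng.integers(0, 256, size=(48, 64, 3), dtype=np.uint8)
aug = rotate_zoom(img, rng.uniform(-10, 10), rng.uniform(0.9, 1.1), order=1)
print(img.shape, aug.shape)
```

    Areas that fall outside the input are filled with zeros by default; pass mode= or cval= through kwargs to change that.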
    

  • 2021-02-01 22:39

    I recommend using cv2.resize because it is way faster than scipy.ndimage.zoom, probably due to support for simpler interpolation methods.

    For a 480x640 image:

    • cv2.resize takes ~2 ms
    • scipy.ndimage.zoom takes ~500 ms
    • scipy.ndimage.zoom(..., order=0) takes ~175 ms

    If you are doing the data augmentation on the fly, this amount of speedup is invaluable because it means more experiments in less time.
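
    If you want to check these numbers on your own machine, something like the following should do. It is a rough sketch: the 3-channel random image and the 1.5x zoom factor are my assumptions (the figures above do not say which were used), time_ms is a made-up helper, and the cv2 timing is skipped when OpenCV is not installed. Absolute timings will vary with hardware and library versions.

```python
import timeit

import numpy as np
from scipy.ndimage import zoom

try:
    import cv2  # optional: only needed for the cv2.resize timing
except ImportError:
    cv2 = None

# Assumed test input: random 480x640 RGB image, zoomed by 1.5x
img = np.random.default_rng(0).integers(0, 256, size=(480, 640, 3), dtype=np.uint8)
factor = 1.5


def time_ms(fn, repeat=3):
    """Best wall-clock time of `fn` over `repeat` single runs, in milliseconds."""
    return min(timeit.repeat(fn, number=1, repeat=repeat)) * 1e3


timings = {
    "scipy zoom (order=3)": time_ms(lambda: zoom(img, (factor, factor, 1))),
    "scipy zoom (order=0)": time_ms(lambda: zoom(img, (factor, factor, 1), order=0)),
}
if cv2 is not None:
    timings["cv2.resize"] = time_ms(lambda: cv2.resize(img, None, fx=factor, fy=factor))

for name, ms in timings.items():
    print(f"{name}: {ms:.1f} ms")
```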

    Here is a version of clipped_zoom using cv2.resize:

    import cv2
    import numpy as np
    
    
    def cv2_clipped_zoom(img, zoom_factor):
        """
        Zoom in or out around the center of the given image and return an
        enlarged or shrunken view without changing its dimensions.
    
        Args:
            img : Image array
            zoom_factor : amount of zoom as a ratio (0 to Inf)
        """
        height, width = img.shape[:2] # It's also the final desired shape
        new_height, new_width = int(height * zoom_factor), int(width * zoom_factor)
    
        ### Crop only the part that will remain in the result (more efficient)
        # Centered bbox of the final desired size in resized (larger/smaller) image coordinates
        y1, x1 = max(0, new_height - height) // 2, max(0, new_width - width) // 2
        y2, x2 = y1 + height, x1 + width
        bbox = np.array([y1,x1,y2,x2])
        # Map back to original image coordinates
        bbox = (bbox / zoom_factor).astype(int)  # np.int was removed in NumPy 1.24
        y1, x1, y2, x2 = bbox
        cropped_img = img[y1:y2, x1:x2]
    
        # Handle padding when downscaling
        resize_height, resize_width = min(new_height, height), min(new_width, width)
        pad_height1, pad_width1 = (height - resize_height) // 2, (width - resize_width) //2
        pad_height2, pad_width2 = (height - resize_height) - pad_height1, (width - resize_width) - pad_width1
        pad_spec = [(pad_height1, pad_height2), (pad_width1, pad_width2)] + [(0,0)] * (img.ndim - 2)
    
        result = cv2.resize(cropped_img, (resize_width, resize_height))
        result = np.pad(result, pad_spec, mode='constant')
        assert result.shape[0] == height and result.shape[1] == width
        return result
    