Merge images using “VNImageHomographicAlignmentObservation” class

情话喂你 2021-01-01 07:33

I am trying to merge two images using VNImageHomographicAlignmentObservation. I am currently getting a 3x3 matrix that looks like this:

simd_flo         


        
1 answer
  • 2021-01-01 07:49

    This homography matrix H describes how to project one of your images onto the image plane of the other image. To transform each pixel to its projected location, compute x' = H * x using homogeneous coordinates: take the 2D image coordinate, append 1.0 as a third component, apply the matrix H, and return to 2D by dividing by the third component of the result.
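
    Written out in components, the projection looks like this. The entry names h_ij, the pixel (p_x, p_y) and the intermediate values (u, v, w) are labels introduced here for illustration only:

    ```latex
    \begin{pmatrix} u \\ v \\ w \end{pmatrix}
    = H \begin{pmatrix} p_x \\ p_y \\ 1 \end{pmatrix}
    = \begin{pmatrix}
    h_{11}\, p_x + h_{12}\, p_y + h_{13} \\
    h_{21}\, p_x + h_{22}\, p_y + h_{23} \\
    h_{31}\, p_x + h_{32}\, p_y + h_{33}
    \end{pmatrix},
    \qquad
    (p_x', p_y') = \left( \frac{u}{w}, \frac{v}{w} \right)
    ```

    The division is only defined where w is non-zero. If the last row of H is (0, 0, 1), then w is always 1 and the homography reduces to an affine transform.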

    The most efficient way to do this for every pixel is to express this matrix multiplication in homogeneous space with CoreImage. CoreImage offers multiple kernel types: CIColorKernel, CIWarpKernel and CIKernel. For this task we only need to change the location of each pixel, so a CIWarpKernel is the right choice. Using the Core Image Kernel Language, that looks as follows:

    import CoreImage

    // CIWarpKernel(source:) is failable: it returns nil if the kernel source does not compile.
    let warpKernel = CIWarpKernel(source:
        """
        kernel vec2 warp(mat3 homography)
        {
            vec3 homogen_in = vec3(destCoord().x, destCoord().y, 1.0); // create homogeneous coord
            vec3 homogen_out = homography * homogen_in; // transform by homography
            return homogen_out.xy / homogen_out.z; // back to normal 2D coordinate
        }
        """
    )!
    

    Note that the shader expects a mat3 called homography, which is the shading-language equivalent of the simd_float3x3 matrix H. When calling the shader, the matrix must be passed as a CIVector. To convert it, use:

    let (col0, col1, col2) = homography.columns
    let homographyCIVector = CIVector(values:[CGFloat(col0.x), CGFloat(col0.y), CGFloat(col0.z),
                                                 CGFloat(col1.x), CGFloat(col1.y), CGFloat(col1.z),
                                                 CGFloat(col2.x), CGFloat(col2.y), CGFloat(col2.z)], count: 9)
    

    When you apply the CIWarpKernel to an image, you have to tell CoreImage how big the output should be. To merge the warped and reference image, the output should be big enough to cover the whole projected and original image. We can compute the size of the projected image by applying the homography to each corner of the image rect (this time in Swift, CoreImage calls this rect the extent):

    /**
     * Convert a 2D point to a homogeneous coordinate, transform by the provided homography,
     * and convert back to a non-homogeneous 2D point.
     */
    func transform(_ point:CGPoint, by homography:matrix_float3x3) -> CGPoint
    {
      let inputPoint = SIMD3<Float>(Float(point.x), Float(point.y), 1.0)
      var outputPoint = homography * inputPoint
      outputPoint /= outputPoint.z
      return CGPoint(x:CGFloat(outputPoint.x), y:CGFloat(outputPoint.y))
    }
    
    func computeExtentAfterTransforming(_ extent:CGRect, with homography:matrix_float3x3) -> CGRect
    {
      let points = [transform(extent.origin, by: homography),
                    transform(CGPoint(x: extent.origin.x + extent.width, y:extent.origin.y), by: homography),
                    transform(CGPoint(x: extent.origin.x + extent.width, y:extent.origin.y + extent.height), by: homography),
                    transform(CGPoint(x: extent.origin.x, y:extent.origin.y + extent.height), by: homography)]
    
      var (xmin, xmax, ymin, ymax) = (points[0].x, points[0].x, points[0].y, points[0].y)
      points.forEach { p in
        xmin = min(xmin, p.x)
        xmax = max(xmax, p.x)
        ymin = min(ymin, p.y)
        ymax = max(ymax, p.y)
      }
      let result = CGRect(x: xmin, y:ymin, width: xmax-xmin, height: ymax-ymin)
      return result
    }
    
    let warpedExtent = computeExtentAfterTransforming(ciFloatingImage.extent, with: homography.inverse)
    let outputExtent = warpedExtent.union(ciFloatingImage.extent)
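
    As a quick sanity check of the extent logic, here is a minimal standalone sketch that uses a pure translation, where the result is easy to verify by hand. The helper names project and projectedBounds and all numeric values are hypothetical and chosen for illustration; they are compact stand-ins for the functions above, not part of the original answer.

    ```swift
    import CoreGraphics
    import simd

    // Hypothetical example values, not taken from the question.
    // A homography that only translates by (10, 20). simd matrices are built
    // from columns, so the translation sits in the third column.
    let translation = simd_float3x3(columns: (
        SIMD3<Float>(1, 0, 0),
        SIMD3<Float>(0, 1, 0),
        SIMD3<Float>(10, 20, 1)
    ))

    // Project one 2D point through a homography using homogeneous coordinates.
    func project(_ point: CGPoint, through h: simd_float3x3) -> CGPoint {
        let v = h * SIMD3<Float>(Float(point.x), Float(point.y), 1)
        return CGPoint(x: CGFloat(v.x / v.z), y: CGFloat(v.y / v.z))
    }

    // Bounding box of the four projected corners of a rect.
    func projectedBounds(of rect: CGRect, through h: simd_float3x3) -> CGRect {
        let corners = [
            CGPoint(x: rect.minX, y: rect.minY),
            CGPoint(x: rect.maxX, y: rect.minY),
            CGPoint(x: rect.maxX, y: rect.maxY),
            CGPoint(x: rect.minX, y: rect.maxY),
        ].map { project($0, through: h) }
        let xs = corners.map { $0.x }
        let ys = corners.map { $0.y }
        return CGRect(x: xs.min()!, y: ys.min()!,
                      width: xs.max()! - xs.min()!,
                      height: ys.max()! - ys.min()!)
    }

    let imageExtent = CGRect(x: 0, y: 0, width: 100, height: 50)
    let shifted = projectedBounds(of: imageExtent, through: translation)
    let combined = shifted.union(imageExtent)
    // shifted has origin (10, 20) and size 100 x 50
    // combined has origin (0, 0) and size 110 x 70
    ```

    The union covers both the shifted and the unshifted rectangle, which is the property the output extent needs so that neither image is cropped in the composite.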
    

    Now you can create a warped version of your floating image:

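    // CIImage(image:) is failable. Create this before computing the extents above,
    // which read ciFloatingImage.extent.
    let ciFloatingImage = CIImage(image: floatingImage)!
    let ciWarpedImage = warpKernel.apply(extent: outputExtent, roiCallback:
        {
            (index, rect) in
            // The kernel maps output coordinates to input coordinates with `homography`,
            // so the input region needed for an output rect is that rect transformed by `homography`.
            return computeExtentAfterTransforming(rect, with: homography)
        },
        image: ciFloatingImage,
        arguments: [homographyCIVector])!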
    

    The roiCallback is there to tell CoreImage which part of the input image is needed to compute a certain part of the output. CoreImage uses this to apply the shader on parts of the image block by block, such that it can process huge images. (See Creating Custom Filters in Apple's docs). A quick hack would be to always return CGRect.infinite here, but then CoreImage can't do any block-wise magic.

    And lastly, create a composite image of the reference image and the warped image:

    let ciReferenceImage = CIImage(image: referenceImage)! // CIImage(image:) is failable
    let ciResultImage = ciWarpedImage.composited(over: ciReferenceImage)
    let resultImage = UIImage(ciImage: ciResultImage)
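
    A UIImage created with UIImage(ciImage:) has no CGImage backing (its cgImage property is nil), which can matter if the result has to be saved or encoded. One possible way around this, sketched here under the assumption that a bitmap is wanted, is to render the composite through a CIContext first. The helper name renderToCGImage and the sample image are hypothetical.

    ```swift
    import CoreImage

    // Render a CIImage into a bitmap-backed CGImage.
    // Returns nil when the extent is infinite or when rendering fails.
    func renderToCGImage(_ image: CIImage, using context: CIContext = CIContext()) -> CGImage? {
        guard !image.extent.isInfinite else { return nil }
        return context.createCGImage(image, from: image.extent)
    }

    // Stand-in for ciResultImage: a solid red 64 x 32 image.
    let sample = CIImage(color: CIColor(red: 1, green: 0, blue: 0))
        .cropped(to: CGRect(x: 0, y: 0, width: 64, height: 32))
    let rendered = renderToCGImage(sample)
    ```

    On iOS the rendered CGImage can then be wrapped with UIImage(cgImage:). Creating a CIContext is relatively expensive, so when rendering more than once it is probably better to create one context and pass it in rather than rely on the default argument.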
    