Question
I am trying to track an object using correlation: frame by frame, I search for a small patch inside a larger image, find the shift at which the correlation is maximal, and update the patch from that location.
My code is:
#include <opencv2/opencv.hpp>
using namespace cv;

cv::Mat imagePart, imageBig;   // current patch and search frame, set elsewhere
cv::Mat im_float_2, imagePart_out;
cv::Mat im_floatBig;
cv::Scalar im1_Mean, im1_Std, im2_Mean, im2_Std;
double covar, correl;
int n_pixels;

void computeShift()
{
    int maxRow = 0, maxCol = 0, TX = 0, TY = 0;
    double GMAX = 0;
    Mat image_window = Mat::zeros(imagePart.rows, imagePart.cols, CV_32F);
    imagePart.convertTo(im_float_2, CV_32F);
    imageBig.convertTo(im_floatBig, CV_32F);
    for (maxRow = 0; maxRow <= imageBig.rows - image_window.rows; maxRow++)
    {
        for (maxCol = 0; maxCol <= imageBig.cols - image_window.cols; maxCol++)
        {
            image_window = im_floatBig(cv::Rect(maxCol, maxRow,
                                                image_window.cols, image_window.rows));
            n_pixels = image_window.rows * image_window.cols;
            // Compute mean and standard deviation of both images
            meanStdDev(image_window, im1_Mean, im1_Std);
            meanStdDev(im_float_2, im2_Mean, im2_Std);
            // Compute covariance and correlation coefficient
            covar = (image_window - im1_Mean).dot(im_float_2 - im2_Mean) / n_pixels;
            correl = covar / (im1_Std[0] * im2_Std[0]);
            if (correl > GMAX)
            {
                GMAX = correl; TX = maxRow; TY = maxCol;
                image_window.convertTo(imagePart, CV_8UC1);
            }
        }
    }
    cvtColor(imagePart, imagePart_out, CV_GRAY2BGR);
    printf("\nComputed shift: [%d, %d] MAX: %f\n", TX, TY, GMAX);
}
But when executing this I get very low FPS (1-2), even for a small video (frame size 262x240, patch size 25x25).
Is there any way to achieve a higher FPS? I am also looking in the direction of phase correlation, but I am not sure how to go about it from here. Would converting to the frequency domain help?
For now, I want to optimize the above code for speed.
Answer 1:
Yes, you will likely gain from using the FFT. Simply pad im_float_2 to the size of im_floatBig. Multiplying in the Fourier domain, after taking the complex conjugate of one of the transforms, yields the cross-correlation. This is not the same as your correl value (there is no division by the standard deviations), but I don't think you actually need that normalization for good template matching: the plain cross-correlation works well by itself. The location of the maximum in the result can be translated to a displacement of the template w.r.t. the image.
The steps for cross-correlation through the FFT are:
- Pad the template (floating image) to the size of the other image (with zeros).
- Compute the FFT of both.
- Flip the sign of the imaginary component of one of the results (complex conjugate).
- Multiply the two.
- Compute the IFFT of the result.
- Find the location of the pixel with the largest value.
The location of this pixel indicates the translation of the padded template w.r.t. the other image. If they match best without translation, the max pixel will be at (x,y) = (0,0); if it is at (1,0), that indicates a one-pixel shift along x. Which direction that is depends on which of the two transforms you took the complex conjugate of. Note that this result is periodic: a one-pixel shift in the opposite direction puts the max pixel on the right edge of the image. Simply experiment a bit to determine how to translate the peak location into a shift of your template.
Regarding your code:
- meanStdDev(im_float_2, im2_Mean, im2_Std) is computed inside the loop, even though im_float_2 never changes. In fact, you could get away with not normalizing at all: you are only looking for the maximum correlation, and dividing every value in your search by the same constant doesn't change which one is the largest. The same applies to the division by n_pixels.
- Move image_window.convertTo(imagePart, CV_8UC1) outside the loop. You will likely update your current maximum many times before finding the actual one, and there is no point in converting all those sub-windows to CV_8U if you only end up using the last one. Inside the loop, record only the (x,y) coordinates of the maximum; convert the final window once.
- You probably don't need to search the whole image for your template. The object likely moves only a small amount between frames, so look only in a small region around the previous known location. The same idea applies to the FFT method: crop out a region of your big image and pad your template to that size. A smaller FFT is cheaper to compute.
- OpenCV stores images row-wise, so keep the loop over the columns as the inner loop: it walks memory contiguously and makes much better use of the cache.
Source: https://stackoverflow.com/questions/52753902/tackle-low-fps-for-correlation-code-to-compute-shift-in-image