Gaussian Blur with FFT Questions

执念已碎 2021-02-10 00:02

I have a current implementation of Gaussian blur using regular convolution. It is efficient enough for small kernels, but once the kernel size gets a little bigger, the perform

2 answers
  •  故里飘歌
    2021-02-10 00:39

    1. The 2-D FFT is separable, and you are correct in how to perform it, except that you must multiply by the 2-D FFT of the 2-D kernel. If you are using kissfft, an easier way to perform the 2-D FFT is to use kiss_fftnd in the tools directory of the kissfft package, which computes multi-dimensional FFTs.

    2. The kernel does not have to be any particular size. If the kernel is smaller than the image, you just need to zero-pad it up to the image size before performing the 2-D FFT. You should also zero-pad the image edges, because multiplication in the frequency domain performs circular convolution, so results wrap around at the edges.
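The wrap-around is easiest to see in one dimension. The sketch below is only an illustration: it uses a naive O(n^2) DFT written with the Python standard library in place of a real FFT routine such as kissfft, and the helper names `dft` and `multiply_in_frequency_domain` are made up for this example.

```python
import cmath

def dft(seq, inverse=False):
    """Naive O(n^2) DFT; a stand-in for a real FFT routine."""
    n = len(seq)
    sign = 1 if inverse else -1
    out = [sum(seq[t] * cmath.exp(sign * 2j * cmath.pi * k * t / n)
               for t in range(n)) for k in range(n)]
    return [v / n for v in out] if inverse else out

def multiply_in_frequency_domain(a, b):
    """Transform both inputs, multiply bin by bin, transform back.

    Both inputs must already have the same length.
    """
    product = [x * y for x, y in zip(dft(a), dft(b))]
    # Keep the real part and round away floating-point noise for display.
    return [round(z.real, 9) for z in dft(product, inverse=True)]

signal = [1.0, 2.0, 3.0, 4.0]   # length M = 4
kernel = [1.0, 1.0]             # length U = 2

# Kernel padded only up to the signal length: the last sample wraps
# around and contaminates the first output sample.
wrapped = multiply_in_frequency_domain(signal, kernel + [0.0, 0.0])

# Both padded to M + U - 1 = 5: the result is the ordinary linear convolution.
linear = multiply_in_frequency_domain(signal + [0.0], kernel + [0.0] * 3)

print(wrapped)  # first sample is 1 + 4 = 5 because of the wrap-around
print(linear)   # first sample is 1, as linear convolution would give
```

Padding both inputs to at least M + U - 1 samples leaves room for the tail of the convolution, so nothing wraps back onto the start.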

    So to summarize (given that the image size is M x N):

    1. come up with a 2-D kernel of any size (U x V)
    2. zero-pad the kernel up to (M+U-1) x (N+V-1)
    3. take the 2-D FFT of the kernel
    4. zero-pad the image up to (M+U-1) x (N+V-1)
    5. take the 2-D FFT of the image
    6. multiply FFT of kernel by FFT of image
    7. take inverse 2-D FFT of result
    8. trim the result back to M x N, discarding the border introduced by the padding

    If you are applying the same filter to several different images, steps 1-3 only need to be done once; the kernel FFT can be reused as long as the padded size stays the same.
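The eight steps above can be sketched end to end as follows. This is a minimal illustration, not a production implementation: it assumes images and kernels are lists of lists of floats, uses a naive DFT from the Python standard library where real code would call an FFT library (for example kiss_fftnd), and all function names are hypothetical. The choice to trim to a centred M x N window in step 8 is also an assumption; other croppings are possible.

```python
import cmath

def dft(seq, inverse=False):
    """Naive O(n^2) 1-D DFT; stands in for a real FFT library call."""
    n = len(seq)
    sign = 1 if inverse else -1
    out = [sum(seq[t] * cmath.exp(sign * 2j * cmath.pi * k * t / n)
               for t in range(n)) for k in range(n)]
    return [v / n for v in out] if inverse else out

def dft2(grid, inverse=False):
    """2-D transform: transform every row, then every column."""
    rows = [dft(r, inverse) for r in grid]
    cols = [dft(list(c), inverse) for c in zip(*rows)]
    return [list(r) for r in zip(*cols)]

def zero_pad(grid, height, width):
    """Copy grid into the top-left corner of a height x width array of zeros."""
    out = [[0.0] * width for _ in range(height)]
    for i, row in enumerate(grid):
        out[i][:len(row)] = row
    return out

def fft_convolve2d(image, kernel):
    m, n = len(image), len(image[0])        # image is M x N
    u, v = len(kernel), len(kernel[0])      # step 1: kernel is U x V
    h, w = m + u - 1, n + v - 1             # padded size

    kernel_f = dft2(zero_pad(kernel, h, w))  # steps 2-3 (reusable across images)
    image_f = dft2(zero_pad(image, h, w))    # steps 4-5

    # Step 6: element-wise complex multiplication.
    product = [[a * b for a, b in zip(row_i, row_k)]
               for row_i, row_k in zip(image_f, kernel_f)]

    # Step 7: inverse transform; the imaginary parts are rounding noise.
    full = [[z.real for z in row] for row in dft2(product, inverse=True)]

    # Step 8: keep the centred M x N window of the (M+U-1) x (N+V-1) result.
    top, left = (u - 1) // 2, (v - 1) // 2
    return [row[left:left + n] for row in full[top:top + m]]

image = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0], [7.0, 8.0, 9.0]]
kernel = [[0.0, 1.0, 0.0], [1.0, 2.0, 1.0], [0.0, 1.0, 0.0]]
blurred = fft_convolve2d(image, kernel)
```

With the centred crop, the output pixel at (i, j) lines up with the input pixel at (i, j), and pixels outside the image are treated as zero.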

    Note: The kernel will have to be rather large for this to be faster than direct convolution. Also, did you implement your direct convolution taking advantage of the fact that a 2-D Gaussian filter is separable (see this a few paragraphs into the "Mechanics" section)? That is, you can perform the 2-D convolution as 1-D convolutions on the rows and then on the columns. I have found this to be faster than most FFT-based approaches unless the kernels are quite large.
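A minimal sketch of the separable approach, assuming the image is a list of lists of floats and that pixels outside the image count as zero. The function names and the `sigma`/`radius` parameters are chosen for this example only. For a kernel of width 2r + 1 this costs 2(2r + 1) multiplications per pixel instead of (2r + 1)^2.

```python
import math

def gaussian_kernel_1d(sigma, radius):
    """Sampled 1-D Gaussian, normalised so the weights sum to 1."""
    weights = [math.exp(-(x * x) / (2.0 * sigma * sigma))
               for x in range(-radius, radius + 1)]
    total = sum(weights)
    return [w / total for w in weights]

def convolve_rows(image, kernel):
    """Convolve every row with a symmetric 1-D kernel (zero outside the image)."""
    radius = len(kernel) // 2
    width = len(image[0])
    out = []
    for row in image:
        new_row = []
        for x in range(width):
            acc = 0.0
            for k, w in enumerate(kernel):
                xx = x + k - radius
                if 0 <= xx < width:
                    acc += w * row[xx]
            new_row.append(acc)
        out.append(new_row)
    return out

def transpose(image):
    return [list(col) for col in zip(*image)]

def separable_blur(image, sigma, radius):
    """Blur rows first, then columns (done by transposing around a row pass)."""
    kernel = gaussian_kernel_1d(sigma, radius)
    horizontal = convolve_rows(image, kernel)
    return transpose(convolve_rows(transpose(horizontal), kernel))

# A single bright pixel spreads into the outer product of the 1-D kernel.
impulse = [[0.0] * 5 for _ in range(5)]
impulse[2][2] = 1.0
result = separable_blur(impulse, sigma=1.0, radius=1)
```

Because the Gaussian kernel is symmetric, the loop does not need to flip the kernel; for an asymmetric kernel the index would have to be reversed to get a true convolution.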

    Response to Edit

    1. If the input is real, the output will still be complex except in rare circumstances. The FFT of your Gaussian kernel will also be complex, so the multiply must be a complex multiplication. When you perform the inverse FFT, the output should be real, since your input image and kernel are both real. The output will be returned in a complex array, but the imaginary components should be zero or very small (floating-point error) and can be discarded.

    2. If you are using the same image, you can reuse the image FFT, but you will need to zero pad based on your biggest kernel size. You will have to compute the FFTs of all of the different kernels.

    3. For visualization, the magnitude of the complex output should be used. The log scale just helps to visualize smaller components of the output that larger components would drown out on a linear scale. The decibel scale is often used and is given by either 20*log10(abs(x)) or 10*log10(x*x'), which are equivalent (x is the complex FFT output and x' is the complex conjugate of x).

    4. The input and output of the FFT will be the same size. Also the real and imaginary parts will be the same size since one real and one imaginary value form a single sample.
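Points 1 and 3 can be made concrete with a few small helpers. This is a sketch under stated assumptions: the function names, the `tol` threshold and the `floor` guard against log10(0) are all introduced here for illustration and are not part of kissfft or of the original answer. The first helper works on separate real and imaginary values, which is how a C implementation typically stores them.

```python
import math

def complex_multiply(a_re, a_im, b_re, b_im):
    """(a_re + i*a_im) * (b_re + i*b_im) on split real/imaginary values."""
    return a_re * b_re - a_im * b_im, a_re * b_im + a_im * b_re

def discard_imaginary(values, tol=1e-9):
    """Keep the real parts, refusing if an imaginary part is not just noise."""
    for z in values:
        if abs(z.imag) > tol:
            raise ValueError("imaginary part too large to be rounding error")
    return [z.real for z in values]

def decibels_from_magnitude(x, floor=1e-12):
    """20*log10(|x|); floor avoids taking the log of zero."""
    return 20.0 * math.log10(max(abs(x), floor))

def decibels_from_power(x, floor=1e-12):
    """10*log10(x * conj(x)); same value as decibels_from_magnitude."""
    power = (x * x.conjugate()).real
    return 10.0 * math.log10(max(power, floor * floor))

re, im = complex_multiply(1.0, 2.0, 3.0, 4.0)   # (1 + 2i)(3 + 4i)
pixels = discard_imaginary([1.0 + 1e-15j, 2.0 - 3e-16j])
db_a = decibels_from_magnitude(3 + 4j)
db_b = decibels_from_power(3 + 4j)
```

The two decibel forms agree because x*x' is the squared magnitude, and log10 of a square is twice the log10 of the value.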
