Fastest method for calculating convolution

谁说胖子不能爱 提交于 2019-12-03 00:08:49
chappjc

If your kernel is separable, the greatest speed gains will be realized by performing multiple sequential 1D convolutions.

Steve Eddins of MathWorks describes how to take advantage of the associativity of convolution to speed up convolution when the kernel is separable in a MATLAB context on his blog. For a P-by-Q kernel, the computational advantage of performing two separate and sequential convolutions vs. 2D convolution is PQ/(P+Q), which corresponds to 4.5x for a 9x9 kernel and ~11x for a 15x15 kernel. EDIT: An interesting unwitting demonstration of this difference was given in this Q&A.

To figure out if the kernel is separable (i.e. the outer product of two vectors) the blog goes on to describe how to check if your kernel is separable with SVD and how to get the 1D kernels. Their example is for a 2D kernel. For a solution for N-dimensional separable convolution, check this FEX submission.


Another resource worth pointing out is this SIMD (SSE3/SSE4) implementation of 3D convolution by Intel, which includes both source and a presentation. The code is for 16 bit integers. Unless you move to GPU (e.g. cuFFT), it is probably hard to get faster than Intel's implementations, which also includes Intel MKL. There is an example of 3D convolution (single-precision float) at the bottom of this page of the MKL documentation (link fixed, now mirrored in https://stackoverflow.com/a/27074295/2778484).

You could try the overlap-add and overlap-save methods. They involve breaking up your input signal into smaller chunks and then using either of the above methods.

An FFT is most likely - and I could be wrong - the fastest method, especially if you are using built-in routines in MATLAB or a library in C++. Apart from that, breaking the input signal into smaller chunks should be a good bet.

i have 2 way to calc fastconv

and 2 betther than 1

1- armadillo you can use armadillo library for calcing conv with this code

cx_vec signal(1024,fill::randn);
cx_vec code(300,fill::randn);
cx_vec ans = conv(signal,code);

2-use fftw ans sigpack and armadillo library for calcing fast conv in this way you must init fft of your code in constructor

FastConvolution::FastConvolution(cx_vec inpCode)
{
    filterCode = inpCode;
    fft_w = NULL;
}


cx_vec FastConvolution::filter(cx_vec inpData)
{
int length = inpData.size()+filterCode.size();
    if((length & (length - 1)) == 0)
    {

    }
    else
    {
        length = pow(2 , (int)log2(length) + 1);
    }
    if(length != fftCode.size())
        initCode(length);

    static cx_vec zeroPadedData;
    if(length!= zeroPadedData.size())
    {
        zeroPadedData.resize(length);
    }
    zeroPadedData.fill(0);
    zeroPadedData.subvec(0,inpData.size()-1) = inpData;


    cx_vec fftSignal = fft_w->fft_cx(zeroPadedData);
    cx_vec mullAns = fftSignal % fftCode;
    cx_vec ans = fft_w->ifft_cx(mullAns);
    return ans.subvec(filterCode.size(),inpData.size()+filterCode.size()-1);
}

void FastConvolution::initCode(int length)
{
    if(fft_w != NULL)
    {
        delete fft_w;
    }
    fft_w = new sp::FFTW(length,FFTW_ESTIMATE);
    cx_vec conjCode(length,fill::zeros);
    fftCode.resize(length);
    for(int i = 0; i < filterCode.size();i++)
    {
        conjCode.at(i) = filterCode.at(filterCode.size() - i - 1);
    }
    conjCode = conj(conjCode);
    fftCode = fft_w->fft_cx(conjCode);
}
易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!