OpenMP drastically slows down for loop

生来不讨喜 2021-01-15 13:42

I am attempting to speed up this for loop with OpenMP parallelization. I was under the impression that this should split up the work across a number of threads. However, p

1 answer
  • 2021-01-15 14:22

    Assuming the loop body has no race condition, you can try fusing the nested loops into a single loop. Fusing gives OpenMP one large iteration space to divide among threads, which helps reduce the effect of false sharing and will likely distribute the load more evenly as well.

    For a triple loop like this

    for(int i2=0; i2<x; i2++) {
        for(int j2=0; j2<y; j2++) {
            for(int k2=0; k2<z; k2++) {
                //
            }
        }
    }
    

    you can fuse it like this

    #pragma omp parallel for
    for(int n=0; n<(x*y*z); n++) {
        int i2 = n/(y*z);
        int j2 = (n%(y*z))/z;
        int k2 = (n%(y*z))%z;
        //
    }
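
To check that the index arithmetic above really reproduces the nested loops, here is a minimal sketch. The helper names `unflatten` and `fused_matches_nested` are made up for illustration and are not part of the original answer; the check runs serially, so it says nothing about speed.

```c
#include <assert.h>

/* Hypothetical helper: decode flat index n into (i2, j2, k2),
 * using the same arithmetic as the fused loop above. */
static void unflatten(int n, int y, int z, int *i2, int *j2, int *k2) {
    *i2 = n / (y * z);
    *j2 = (n % (y * z)) / z;
    *k2 = (n % (y * z)) % z;
}

/* Returns 1 if walking n = 0 .. x*y*z-1 yields the same (i2, j2, k2)
 * triples, in the same order, as the original triple loop. */
static int fused_matches_nested(int x, int y, int z) {
    int n = 0;
    for (int i2 = 0; i2 < x; i2++) {
        for (int j2 = 0; j2 < y; j2++) {
            for (int k2 = 0; k2 < z; k2++, n++) {
                int a, b, c;
                unflatten(n, y, z, &a, &b, &c);
                if (a != i2 || b != j2 || c != k2) {
                    return 0;   /* mapping disagrees with the nested loops */
                }
            }
        }
    }
    return n == x * y * z;      /* every flat index was accounted for */
}
```

Calling `fused_matches_nested(3, 4, 5)` from a small `main` should return 1. For perfectly nested loops, OpenMP 3.0 and later also provide a `collapse(3)` clause on `parallel for` that asks the compiler to do this flattening itself, which may be worth trying before rewriting the indices by hand.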
    

    In your case you can do it like this

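    int i, j, k, n;
    // Number of i values visited: i = 1, 3, 5, ... up to newNx, i.e. ceil(newNx/2)
    int x = newNx%2 ? newNx/2+1 : newNx/2;
    int y = newNy;
    int z = newNz;
    
    // n is the loop variable, so OpenMP makes it private automatically
    #pragma omp parallel for private(i, j, k)
    for(n=0; n<(x*y*z); n++) {
        i = 2*(n/(y*z)) + 1;   // odd values 1, 3, 5, ...
        j = (n%(y*z))/z + 1;   // 1..y
        k = (n%(y*z))%z + 1;   // 1..z
        // rest of code
    }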
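
The strided mapping is easy to get wrong by one, so a small coverage check may help. The sketch below assumes the original loops were `for (i = 1; i <= newNx; i += 2)`, `for (j = 1; j <= newNy; j++)`, `for (k = 1; k <= newNz; k++)`. The question as quoted here does not show them, so those bounds are a guess inferred from the index formulas, and `strided_fusion_covers` is a hypothetical name.

```c
#include <assert.h>
#include <stdlib.h>

/* Returns 1 if the fused loop touches exactly the cells with odd i in
 * 1..newNx, j in 1..newNy, k in 1..newNz, each exactly once.
 * ASSUMPTION: these are the bounds of the original nested loops. */
static int strided_fusion_covers(int newNx, int newNy, int newNz) {
    int x = newNx % 2 ? newNx / 2 + 1 : newNx / 2;
    int y = newNy;
    int z = newNz;
    size_t cells = (size_t)(newNx + 1) * (newNy + 1) * (newNz + 1);
    unsigned char *hits = calloc(cells, 1);
    if (!hits) {
        return 0;
    }

    /* Without OpenMP enabled the pragma is ignored and the loop runs serially. */
    #pragma omp parallel for
    for (int n = 0; n < x * y * z; n++) {
        int i = 2 * (n / (y * z)) + 1;
        int j = (n % (y * z)) / z + 1;
        int k = (n % (y * z)) % z + 1;
        /* Each n maps to a distinct cell, so no two threads write the same byte. */
        hits[((size_t)i * (newNy + 1) + j) * (newNz + 1) + k]++;
    }

    int ok = 1;
    for (int i = 0; i <= newNx; i++) {
        for (int j = 0; j <= newNy; j++) {
            for (int k = 0; k <= newNz; k++) {
                int expected = (i % 2 == 1) && j >= 1 && k >= 1;
                size_t idx = ((size_t)i * (newNy + 1) + j) * (newNz + 1) + k;
                if (hits[idx] != expected) {
                    ok = 0;
                }
            }
        }
    }
    free(hits);
    return ok;
}
```

Declaring `i`, `j`, and `k` inside the loop body, as done here, makes them private to each thread without a `private` clause. If your real loop bounds differ from the assumed ones, adjust `x`, the offsets, and the `expected` test to match.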
    

    If this successfully speeds up your code, then you can feel good that you made your code faster and at the same time obfuscated it even further.
