Multi-dimensional nested OpenMP loop

我是研究僧i 提交于 2020-01-13 08:23:22

问题


What is the proper way to parallelize a multi-dimensional embarrassingly parallel loop in OpenMP? The number of dimensions is known at compile-time, but which dimensions will be large is not. Any of them may be one, two, or a million. Surely I don't want N omp parallel's for an N-dimensional loop...

Thoughts:

  • The problem is conceptually simple. Only the outermost 'large' loop needs to be parallelized, but the loop dimensions are unknown at compile-time and may change.

  • Will dynamically setting omp_set_num_threads(1) and #pragma omp for schedule(static, huge_number) make certain loop parallelizations a no-op? Will this have undesired side-effects/overhead? Feels like a kludge.

  • The OpenMP Specification (2.10, A.38, A.39) tells the difference between conforming and non-conforming nested parallelism, but doesn't suggest the best approach to this problem.

  • Re-ordering the loops is possible but may result in a lot of cache-misses. Unrolling is possible but non-trivial. Is there another way?

Here's what I'd like to parallelize:

for(i0=0; i0<n[0]; i0++) {
  for(i1=0; i1<n[1]; i1++) {
    ...
       for(iN=0; iN<n[N]; iN++) {
         <embarrasingly parallel operations>
       }
    ...
  }
}

Thanks!


回答1:


The collapse directive is probably what you're looking for, as described here. This will essentially form a single loop, which is then parallized, and is designed for exactly these sorts of situations. So you'd do:

#pragma omp parallel for collapse(N)
for(int i0=0; i0<n[0]; i0++) {
  for(int i1=0; i1<n[1]; i1++) {
    ...
       for(int iN=0; iN<n[N]; iN++) {
         <embarrasingly parallel operations>
       }
    ...
  }
}

and be all set.



来源:https://stackoverflow.com/questions/5287321/multi-dimensional-nested-openmp-loop

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!