Reduction with OpenMP: linear merging or log(number of threads) merging

悲&欢浪女 2021-01-25 02:25

I have a general question about reductions with OpenMP that's bothered me for a while. My question concerns how the partial sums are merged in a reduction. It can either be

1 Answer
  • 2021-01-25 03:30

    The OpenMP implementation decides how best to perform the reduction, based on the implementor's knowledge of the specific characteristics of the hardware it's running on. On a system with a small number of CPUs, it will probably do a linear reduction. On a system with hundreds or thousands of cores (e.g. GPU, Intel Xeon Phi), it will likely do a log(n) reduction.
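To make the two strategies concrete, here is a minimal sketch of how per-thread partial sums could be merged either way. The function names `linear_merge` and `tree_merge` and the `partial` array are hypothetical, chosen for illustration; this is not how any particular OpenMP runtime does it internally.

```c
#include <assert.h>
#include <stddef.h>

/* Linear merge: a single thread adds the partial sums one after another.
   This takes n - 1 sequential additions. */
double linear_merge(const double *partial, size_t n)
{
    assert(n > 0);
    double total = partial[0];
    for (size_t i = 1; i < n; ++i)
        total += partial[i];
    return total;
}

/* Tree merge: in each round, slot i absorbs slot i + stride.
   The additions within one round touch disjoint slots, so they can run
   in parallel; the number of rounds is about log2(n).
   Note: this overwrites the contents of partial[]. */
double tree_merge(double *partial, size_t n)
{
    assert(n > 0);
    for (size_t stride = 1; stride < n; stride *= 2) {
        /* Ignored when compiled without OpenMP; the result is the same. */
        #pragma omp parallel for
        for (size_t i = 0; i < n - stride; i += 2 * stride)
            partial[i] += partial[i + stride];
    }
    return partial[0];
}
```

Both functions perform n - 1 additions in total; the tree version only shortens the critical path, which is why it tends to pay off when n (the number of threads) is large. With floating-point data the two orders of addition can round differently, so the results may differ in the last bits.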

    The time spent in the reduction might not matter for very large problems, but for smaller problems it could add a few percent to the total runtime. Your own implementation might be just as fast in many cases, but I doubt it would ever be faster, so why not let OpenMP decide on the optimal reduction strategy?
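Letting OpenMP decide amounts to using the `reduction` clause and not writing the merge step at all. A minimal sketch, assuming a plain array sum (the function name `sum_array` is made up for this example):

```c
#include <assert.h>
#include <stddef.h>

/* Sum an array with OpenMP's built-in reduction.
   Each thread accumulates into its own private copy of sum; how those
   private copies are combined at the end is left to the implementation. */
double sum_array(const double *a, size_t n)
{
    assert(a != NULL || n == 0);
    double sum = 0.0;
    #pragma omp parallel for reduction(+:sum)
    for (size_t i = 0; i < n; ++i)
        sum += a[i];
    return sum;
}
```

Compile with your compiler's OpenMP flag (for example `-fopenmp` with GCC); without it the pragma is ignored and the loop runs serially with the same result for this input.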
