Reduction with OpenMP: linear merging or log(number of threads) merging

前端未结

I have a general question about reductions with OpenMP that's bothered me for a while. My question concerns merging the partial sums in a reduction. It can either be

1 Answer

遇见更好的自我

2021-01-25 03:30

The OpenMP implementation will decide on the best way to do the reduction based on the implementor's knowledge of the specific characteristics of the hardware it's running on. On a system with a small number of CPUs, it will probably do a linear reduction. On a system with hundreds or thousands of cores (e.g. a GPU or Intel Xeon Phi) it will likely do a log(n) reduction, where n is the number of threads.
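To make that concrete, here is a minimal sketch of handing the merge to the runtime through the `reduction` clause. The function name `sum_with_reduction` is made up for illustration. Whether the partial sums end up merged linearly or in a tree is not visible in the source and depends on the implementation.

```c
/* Hypothetical helper: sums n doubles in parallel.
   The reduction clause gives every thread a private copy of `sum`
   (initialised to 0 for +) and lets the OpenMP runtime choose how
   the private copies are merged at the end of the loop. */
double sum_with_reduction(const double *a, int n)
{
    double sum = 0.0;

    #pragma omp parallel for reduction(+:sum)
    for (int i = 0; i < n; i++) {
        sum += a[i];
    }

    return sum;
}
```

With GCC or Clang this needs `-fopenmp` to run in parallel. Without that flag the pragma is ignored and the loop runs serially, still giving the correct sum.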

The time spent in the reduction might not matter for very large problems, but for smaller problems it could add a few percent to the total runtime. Your implementation might be just as fast in many cases, but I doubt it would ever be faster, so why not let OpenMP decide on the optimal reduction strategy?
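For comparison, a hand-written linear merge usually looks something like the sketch below. The question's own code isn't shown here, so this is only an assumed shape, and `sum_manual_linear` is a made-up name. Each thread accumulates a private partial sum, then the threads add their partials to the shared total one at a time inside a `critical` section, so the merge cost grows linearly with the number of threads.

```c
/* Hypothetical hand-rolled reduction with a linear merge step. */
double sum_manual_linear(const double *a, int n)
{
    double total = 0.0;

    #pragma omp parallel
    {
        double partial = 0.0;   /* private to each thread */

        /* Split the iterations across threads; nowait skips the
           barrier so a thread can merge as soon as it is done. */
        #pragma omp for nowait
        for (int i = 0; i < n; i++) {
            partial += a[i];
        }

        /* Linear merge: one thread at a time updates the total. */
        #pragma omp critical
        total += partial;
    }

    return total;
}
```

This gives the same result as the `reduction` clause (up to floating-point summation order), but it fixes the merge strategy in your code instead of leaving it to the runtime.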
