Are there any preprocessor directives that control loop unrolling?

Posted by 微笑、不失礼 on 2019-12-05 22:21:43

Question


Furthermore, how does the compiler determine how far to unroll a loop, assuming all operations in the loop are completely independent of other iterations?


Answer 1:


For MSVC there is only a vector independence hint: http://msdn.microsoft.com/en-us/library/hh923901.aspx

#pragma loop( ivdep )

For many other compilers, such as Intel and IBM, there are several pragma hints for optimizing a loop:

#pragma unroll
#pragma loop_count(N)
#pragma ivdep

There is a thread with the MSVC++ team about the unroll heuristic: http://social.msdn.microsoft.com/Forums/en-US/vcgeneral/thread/d0b225c2-f5b0-4bb9-ac6a-4d4f61f7cb17/

VC tries to balance execution speed and code size. You can change the balance by using flags /O1 or /O2, but even when optimizing for speed VC tries to conserve code size as well.

Basically, unrolling increases code size, so it may be limited in the Os and O1 modes (modes table).

PS: A pragma looks like a preprocessor directive, but it does not behave like one. It is a directive for the compiler: the preprocessor does not act on it and passes it through unchanged.
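To make the code-size trade-off concrete, here is a minimal sketch of what unrolling by a factor of 4 amounts to when written by hand. The function names `sum_rolled` and `sum_unrolled4` are hypothetical, and a real compiler chooses its own factor and code shape; this only illustrates why the unrolled form is larger.

```c
#include <stddef.h>

/* Rolled form: one addition and one loop test per element. */
long sum_rolled(const int *v, size_t n)
{
    long total = 0;
    for (size_t i = 0; i < n; i++)
        total += v[i];
    return total;
}

/* Hand-unrolled by 4 (illustrative factor): four additions per loop test.
 * The loop body is four times longer, and a second loop is needed
 * for the 0 to 3 elements left over when n is not a multiple of 4. */
long sum_unrolled4(const int *v, size_t n)
{
    long total = 0;
    size_t i = 0;

    for (; i + 4 <= n; i += 4) {
        total += v[i];
        total += v[i + 1];
        total += v[i + 2];
        total += v[i + 3];
    }
    for (; i < n; i++)      /* leftover elements */
        total += v[i];
    return total;
}
```

Both functions return the same result; the second trades a larger body for fewer loop tests and branches, which is the balance the optimization flags shift.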




Answer 2:


In the case of Intel Compiler:

#pragma loop_count (n) tells the compiler how many iterations to expect, which helps it choose the best strategy for vectorizing the loop. In that sense it also helps to drive loop unrolling. Example:

#pragma loop_count min(n),max(n),avg(n)
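As a minimal sketch of where the hint goes, the pragma is placed directly before the loop it describes. The function name `add_arrays` and the trip-count values are hypothetical, and the `__INTEL_COMPILER` guard is an assumption about how to detect the classic Intel compiler; with other compilers the pragma is simply compiled out.

```c
#include <stddef.h>

/* Element-wise addition of two arrays of length n. */
void add_arrays(float *out, const float *a, const float *b, size_t n)
{
    /* Assumed guard: only emit the Intel-specific hint for that compiler.
     * The counts are illustrative: at least 4, at most 64, typically 16. */
#if defined(__INTEL_COMPILER)
#pragma loop_count min(4), max(64), avg(16)
#endif
    for (size_t i = 0; i < n; i++)
        out[i] = a[i] + b[i];
}
```

The hint does not change what the loop computes; it only informs the compiler's choice of strategy.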

#pragma unroll (n) works only when used with the -O3 flag. You can use the following strategy to unroll your loop according to the target processor.

Despite the larger code generated by loop unrolling, it may be worth it, since the compiler will produce a version of the loop for scalar operations as well as one for vector operations.

Unrolling can also hurt performance. For instance, a loop with 20 iterations and a vector length of 16 results in one loop that executes 16 operations at once and a remainder loop that executes the other 4 sequentially. To avoid the remainder loop generated by the compiler, we can place one of the following before the loop:
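A minimal sketch of requesting a specific unroll factor follows. The function name `scale` and the factor 4 are hypothetical, and the `__INTEL_COMPILER` guard is an assumption about detecting the classic Intel compiler; the right factor depends on the target processor and should be measured rather than guessed.

```c
#include <stddef.h>

/* Multiply every element of src by factor and store the result in dst. */
void scale(float *dst, const float *src, float factor, size_t n)
{
    /* Assumed guard: the hint is emitted only for the Intel compiler.
     * It asks for the loop body to be replicated 4 times (illustrative). */
#if defined(__INTEL_COMPILER)
#pragma unroll(4)
#endif
    for (size_t i = 0; i < n; i++)
        dst[i] = src[i] * factor;
}
```

The pragma must come immediately before the loop it applies to, and the computed result is the same with or without it.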

#pragma vector novecremainder // or -mP2OPT_hpo_vec_peel=F to disable peel and remainder loops (compiler internal option)

or

#pragma nounroll // where unrolling is not worthwhile at all
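The 20-iteration case above can be sketched in plain C to show where the remainder loop comes from. This is a hand-written illustration, not the code the compiler actually emits: `VECTOR_LENGTH`, `vector_iterations`, `remainder_iterations`, and `increment_all` are hypothetical names, and the inner loop merely stands in for a single vector operation.

```c
#include <stddef.h>

enum { VECTOR_LENGTH = 16 };   /* illustrative vector length from the text */

/* Iterations covered by full vector-width blocks. */
size_t vector_iterations(size_t n)
{
    return n - n % VECTOR_LENGTH;
}

/* Iterations left for the sequential remainder loop. */
size_t remainder_iterations(size_t n)
{
    return n % VECTOR_LENGTH;
}

/* Add 1 to every element, structured as main loop plus remainder loop. */
void increment_all(int *v, size_t n)
{
    size_t i = 0;

    /* Main loop: each outer step stands in for one 16-wide vector operation. */
    for (; i + VECTOR_LENGTH <= n; i += VECTOR_LENGTH)
        for (size_t lane = 0; lane < VECTOR_LENGTH; lane++)
            v[i + lane] += 1;

    /* Remainder loop: the leftover elements, processed one at a time. */
    for (; i < n; i++)
        v[i] += 1;
}
```

For n = 20, `vector_iterations` gives 16 and `remainder_iterations` gives 4, matching the split described above.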

Just to clarify #pragma ivdep:

  • It gives the compiler a specific hint that changes its heuristics about dependencies, and it must be used only when we know that the assumed dependencies are safe to ignore.
  • Most importantly, it overrides only potential (assumed) dependencies. The compiler still performs its dependence analysis and respects any dependency it can prove. To vectorize regardless of any analysis, try #pragma simd.
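A minimal sketch of a loop where such a hint is appropriate: the compiler cannot tell whether `k` is negative, so it has to assume a dependency between iterations, while the programmer knows `k` is never negative. The function name `shift_scale` is hypothetical, and the compiler-detection macros in the guard are assumptions; the two pragma spellings are the Intel and MSVC forms mentioned in the answers.

```c
#include <assert.h>

/* a must hold at least m + k elements.
 * Each element takes the value of the element k places ahead, times c. */
void shift_scale(int *a, int k, int c, int m)
{
    /* The guarantee the programmer is making: reads are never behind writes. */
    assert(k >= 0);

#if defined(__INTEL_COMPILER)
#pragma ivdep
#elif defined(_MSC_VER)
#pragma loop(ivdep)
#endif
    for (int i = 0; i < m; i++)
        a[i] = a[i + k] * c;
}
```

If `k` could be negative, each iteration would read a value written by an earlier one, and telling the compiler to ignore that dependency would produce wrong results.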

Hope this helps.



Source: https://stackoverflow.com/questions/12830450/are-there-any-preprocessor-directives-that-control-loop-unrolling
