Performance of branch prediction in a loop

Submitted by 你。 on 2019-12-06 09:23:46

Question


Would there be any noticeable speed difference between these two snippets of code? Naively, I think the second snippet would be faster because branch instructions are encountered a lot less, but on the other hand the branch predictor should solve this problem. Or will it have a noticeable overhead despite the predictable pattern? Assume that no conditional move instruction is used.

Snippet 1:

for (int i = 0; i < 100; i++) {
    if (a == 3)
        output[i] = 1;
    else
        output[i] = 0;
}

Snippet 2:

if (a == 3) {
    for (int i = 0; i < 100; i++)
        output[i] = 1;
} else {
    for (int i = 0; i < 100; i++)
        output[i] = 0;
}

I'm not intending to optimise these cases myself, but I would like to know more about the overhead of branches even with a predictable pattern.


Answer 1:


Since a does not change once you enter the loop, there shouldn't be much difference between the two code snippets.

Personally, I would prefer the former, unless the branch predictor fails to predict the branch, which is really unlikely given that a remains unchanged in the loop.

Moreover, the compiler may perform this optimization:

  • Loop unswitching

thereby making both code snippets compile to exactly the same machine instructions.
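To make the transformation concrete, here is a minimal sketch of the two loop shapes as standalone functions. The function names and the n parameter are assumptions added for illustration; they are not part of the question. The second function is what loop unswitching would produce from the first, written out by hand.

```c
#include <stddef.h>

/* Shape of Snippet 1: the loop-invariant test on `a` sits inside the loop,
 * so the branch is executed on every iteration. */
void fill_branch_in_loop(int *output, size_t n, int a)
{
    for (size_t i = 0; i < n; i++) {
        if (a == 3)
            output[i] = 1;
        else
            output[i] = 0;
    }
}

/* Shape of Snippet 2: the test on `a` is hoisted out of the loop and the
 * loop body is duplicated, once per arm of the branch. */
void fill_unswitched(int *output, size_t n, int a)
{
    if (a == 3) {
        for (size_t i = 0; i < n; i++)
            output[i] = 1;
    } else {
        for (size_t i = 0; i < n; i++)
            output[i] = 0;
    }
}
```

Both functions write the same values for any a. Whether a given compiler actually unswitches the first one depends on the compiler and the optimization level (GCC, for example, has a -funswitch-loops option), so the way to confirm it is to compare the generated assembly for the two functions.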




Answer 2:


You asked a performance question without specifying hardware (although from the question we can infer that it's one of the architectures that have branch prediction), toolchain, or compile options.

Overall, this is just another space vs. speed tradeoff, where space itself often affects speed (CPU instruction and micro-op caches).

The only reasonable answer is "Performance will vary depending on processor hardware and compiler optimizations."
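Since the result depends on hardware and compiler, the practical way to answer it for a given setup is to measure. The following is only a rough timing sketch: time_fill and fill_snippet1 are hypothetical names introduced here, clock() is a coarse timer, and the numbers it produces are only meaningful when compared across variants built with the same compiler flags on the same machine.

```c
#include <stddef.h>
#include <time.h>

/* Hypothetical variant under test: same shape as Snippet 1. */
void fill_snippet1(int *output, size_t n, int a)
{
    for (size_t i = 0; i < n; i++) {
        if (a == 3)
            output[i] = 1;
        else
            output[i] = 0;
    }
}

/* Hypothetical helper: calls `fill` `reps` times and returns the elapsed
 * CPU time in seconds as reported by clock(). Requires n > 0. */
double time_fill(void (*fill)(int *, size_t, int),
                 int *output, size_t n, int a, long reps)
{
    clock_t start = clock();
    for (long r = 0; r < reps; r++) {
        fill(output, n, a);
        /* Read one result through a volatile object so the compiler is
         * discouraged from treating the stores as dead code. */
        volatile int sink = output[n - 1];
        (void)sink;
    }
    clock_t end = clock();
    return (double)(end - start) / CLOCKS_PER_SEC;
}
```

To compare the two snippets, one would write the second variant with the same signature, pass each to time_fill with a large reps value, and repeat the runs several times. If the optimizer unswitches the loop, the two timings should be indistinguishable.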



Source: https://stackoverflow.com/questions/12251160/performance-of-branch-prediction-in-a-loop
