Is there a compiler hint for GCC to force branch prediction to always go a certain way?

日久生厌 2020-11-28 02:06

For the Intel architectures, is there a way to instruct the GCC compiler to generate code that always forces branch prediction a particular way in my code? Does the Intel h

8 answers
  • 2020-11-28 02:53

    No, there is not. (At least on modern x86 processors.)

    __builtin_expect mentioned in other answers influences the way gcc arranges the assembly code. It does not directly influence the CPU's branch predictor. Of course, there will be indirect effects on branch prediction caused by reordering the code. But on modern x86 processors there is no instruction that tells the CPU "assume this branch is/isn't taken".

    See this question for more detail: Intel x86 0x2E/0x3E Prefix Branch Prediction actually used?

    To be clear, __builtin_expect and/or the use of -fprofile-arcs can still improve the performance of your code. The resulting code layout gives the branch predictor indirect hints (see Performance optimisations of x86-64 assembly - Alignment and branch prediction), and keeping "unlikely" code away from "likely" code improves cache behaviour.
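    As a rough illustration of the layout hint described above, here is a minimal sketch. The macro names LIKELY and UNLIKELY and the function sum_checked are made up for this example (the Linux kernel uses similar lower-case likely/unlikely macros); only __builtin_expect itself comes from GCC. The hint tells the compiler that a negative entry is rare, so it is free to move the early return out of the hot loop body. It does not program the CPU's branch predictor.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical convenience macros for this example; not part of GCC.
   The !!(x) normalises any truthy value to 1 before comparing. */
#if defined(__GNUC__)
#define LIKELY(x)   __builtin_expect(!!(x), 1)
#define UNLIKELY(x) __builtin_expect(!!(x), 0)
#else
#define LIKELY(x)   (x)
#define UNLIKELY(x) (x)
#endif

/* Adds up values[0..n-1]. Returns -1 as soon as a negative entry is
   seen; that case is assumed to be rare, hence the UNLIKELY hint. */
long sum_checked(const int *values, size_t n)
{
    assert(values != NULL || n == 0);

    long total = 0;
    for (size_t i = 0; i < n; ++i) {
        if (UNLIKELY(values[i] < 0)) {
            return -1;          /* cold path */
        }
        total += values[i];     /* hot path, ideally straight-line code */
    }
    return total;
}
```

    Whether this helps at all depends on the compiler version, the optimisation level and the data, so it is worth measuring before and after.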

  • 2020-11-28 02:57

    __builtin_expect can be used to tell the compiler which way you expect a branch to go. This can influence how the code is generated. Processors typically run straight-line code faster than code that takes branches. So if you write

    if (__builtin_expect (x == 0, 0)) ++count;
    if (__builtin_expect (y == 0, 0)) ++count;
    if (__builtin_expect (z == 0, 0)) ++count;
    

    the compiler may generate code equivalent to

    if (x == 0) goto if1;
    back1: if (y == 0) goto if2;
    back2: if (z == 0) goto if3;
    back3: ;
    ...
    if1: ++count; goto back1;
    if2: ++count; goto back2;
    if3: ++count; goto back3;
    

    If your hint is correct, the common path executes without any taken branches. It will run faster than the usual layout, where each if statement branches around its conditional code, for three taken branches in total.
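    To try this yourself, the three statements can be wrapped in a small function. This is only a sketch: the name count_zeros is made up here, and the fallback macro for non-GCC compilers is an assumption added so the file still compiles elsewhere.

```c
/* Fallback so the example also builds with compilers that lack the
   builtin (an assumption for portability; GCC and Clang provide it). */
#if !defined(__GNUC__)
#define __builtin_expect(expr, expected) (expr)
#endif

/* Counts how many of x, y and z are zero, telling the compiler that
   each comparison is expected to be false. */
int count_zeros(int x, int y, int z)
{
    int count = 0;
    if (__builtin_expect(x == 0, 0)) ++count;
    if (__builtin_expect(y == 0, 0)) ++count;
    if (__builtin_expect(z == 0, 0)) ++count;
    return count;
}
```

    Compiling with gcc -O2 -S and reading the assembly shows what the compiler actually did. For code this small it may well choose a branch-free sequence instead of the goto layout shown above, in which case the hint has no visible effect.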

    Some x86 processors recognise branch-hint instruction prefixes (0x2E for "expected not taken", 0x3E for "expected taken"); I'm not sure about the details, or whether current processors act on them at all. They are not very useful anyway, because the hardware's dynamic branch prediction handles this just fine. So I don't think you can actually influence the branch prediction directly.
