Branch on ?: operator? | 易学教程

问题

For a typical modern compiler on modern hardware, will the ? : operator result in a branch that affects the instruction pipeline?

In other words which is faster, calling both cases to avoid a possible branch:

bool testVar = someValue(); // Used later.
purge(white);
purge(black);

or picking the one actually needed to be purged and only doing it with an operator ?::

bool testVar = someValue();
purge(testVar ? white : black);

I realize you have no idea how long purge() will take, but I'm just asking a general question here about whether I would ever want to call purge() twice to avoid a possible branch in the code.

I realize this is a very tiny optimization and may make no real difference, but would still like to know. I expect the ?: does not result in branching, but want to make sure my understanding is correct.

回答1:

Depends on the platform. Specifically, it depends on the size of jump prediction table of the CPU and whether the CPU allows conditional operations (like on ARM).

CPUs with conditional operations will strongly favor the second case. CPUs with bigger jump prediction tables will favor the first case.

The real answer (like with any other performance questions): measure and compare. Sometimes the rest of the code throws a curve ball and it's usually impossible to predict effects of some changes.

回答2:

The CMOV (Conditional MOVe) instruction has been part of the x86 instruction set since the Pentium Pro. It is rarely automatically generated by GCC because of compiler options commonly used and restrictions placed by the C language. A SETCC/CMOV sequence can be inserted by inline assembly in your C program. This should only be done is cases where the conditional variable is a randomly oscillating value in the inner loop (millions of executions) of a program. In non-oscillating cases and in cases of simple patterns of oscillation, modern processors can predict branches with a very high degree of accuracy. In 2007, Linus Torvalds suggested here to avoid use of CMOV in most situations.

Intel describes the conditional move in the Intel(R) Architecture Software Developer's Manual, Volume 2: Instruction Set Reference Manual:

The CMOVcc instructions check the state of one or more of the status flags in the EFLAGS
register (CF, OF, PF, SF, and ZF) and perform a move operation if the flags are in a specified
state (or condition). A condition code (cc) is associated with each instruction to indicate the
condition being tested for. If the condition is not satisfied, a move is not performed and execution
continues with the instruction following the CMOVcc instruction.

These instructions can move a 16- or 32-bit value from memory to a general-purpose register or
from one general-purpose register to another. Conditional moves of 8-bit register operands are
not supported.

The conditions for each CMOVcc mnemonic is given in the description column of the above
table. The terms “less” and “greater” are used for comparisons of signed integers and the terms
“above” and “below” are used for unsigned integers.

Because a particular state of the status flags can sometimes be interpreted in two ways, two
mnemonics are defined for some opcodes. For example, the CMOVA (conditional move if
above) instruction and the CMOVNBE (conditional move if not below or equal) instruction are
alternate mnemonics for the opcode 0F 47H.

回答3:

I can't imagine the first method would ever be faster.

With the first method you may avoid a branch, but you replace it with a function call, which would usually involve a branch plus a lot more (unless it was inlined). Even if inlined, unless the functionality inside the purge() function was absolutely trivial it would almost certainly be slower.

回答4:

Calling a function is at least as expensive as doing a logic test + jump (and yes, the ? : ternary operator would require a jump).

回答5:

in the first case purge is called twice. In the second case purge is called once

Its hard to answer the question about branching because its so dependent on compilers and instruction set. For example on an ARM (which has conditional instruction execution) it might not branch. ON an x86 it almost certainly will

来源：https://stackoverflow.com/questions/7127759/branch-on-operator

标签

c++

compiler-construction

hardware

micro-optimization