I want to implement a logical operation that works as efficient as possible. I need this truth table:
p q p → q
T T T
T F F
F T T
F
Another solution for C booleans (a bit dirty, but works):
((unsigned int)(p) <= (unsigned int)(q))
It works since by the C standard, 0
represents false, and any other value true (1
is returned for true by boolean operators, int
type).
The "dirtiness" is that I use booleans (p
and q
) as integers, which contradicts some strong typing policies (such as MISRA), well, this is an optimization question. You may always #define
it as a macro to hide the dirty stuff.
For proper boolean p
and q
(having either 0
or 1
binary representations) it works. Otherwise T->T
might fail to produce T
if p
and q
have arbitrary nonzero values for representing true.
If you need to store the result only, since the Pentium II, there is the cmovcc
(Conditional Move) instruction (as shown in Derobert's answer). For booleans, however even the 386 had a branchless option, the setcc
instruction, which produces 0
or 1
in a result byte location (byte register or memory). You can also see that in Derobert's answer, and this solution also compiles to a result involving a setcc
(setbe
: Set if below or equal).
Derobert and Chris Dolan's ~p | q
variant should be the fastest for processing large quantities of data since it can process the implication on all bits of p
and q
individually.
Notice that not even the !p || q
solution compiles to branching code on the x86: it uses setcc
instructions. That's the best solution if p
or q
may contain arbitrary nonzero values representing true. If you use the _Bool
type, it will generate very few instructions.
I got the following figures when compiling for the x86:
__attribute__((fastcall)) int imp1(int a, int b)
{
return ((unsigned int)(a) <= (unsigned int)(b));
}
__attribute__((fastcall)) int imp2(int a, int b)
{
return (!a || b);
}
__attribute__((fastcall)) _Bool imp3(_Bool a, _Bool b)
{
return (!a || b);
}
__attribute__((fastcall)) int imp4(int a, int b)
{
return (~a | b);
}
Assembly result:
00000000 :
0: 31 c0 xor %eax,%eax
2: 39 d1 cmp %edx,%ecx
4: 0f 96 c0 setbe %al
7: c3 ret
00000010 :
10: 85 d2 test %edx,%edx
12: 0f 95 c0 setne %al
15: 85 c9 test %ecx,%ecx
17: 0f 94 c2 sete %dl
1a: 09 d0 or %edx,%eax
1c: 0f b6 c0 movzbl %al,%eax
1f: c3 ret
00000020 :
20: 89 c8 mov %ecx,%eax
22: 83 f0 01 xor $0x1,%eax
25: 09 d0 or %edx,%eax
27: c3 ret
00000030 :
30: 89 d0 mov %edx,%eax
32: f7 d1 not %ecx
34: 09 c8 or %ecx,%eax
36: c3 ret
When using the _Bool
type, the compiler clearly exploits that it only has two possible values (0
for false and 1
for true), producing a very similar result to the ~a | b
solution (the only difference being that the latter performs a complement on all bits instead of just the lowest bit).
Compiling for 64 bits gives just about the same results.
Anyway, it is clear, the method doesn't really matter from the point of avoiding producing conditionals.