This is only an issue on GCC versions prior to 4.4; it was fixed in GCC 4.5.
Is it possible to tell the compiler the variable used in a switch fits within the provided
This question is certainly interesting from the standpoint of a missed compiler optimization that is seemingly obvious to us, and I did spend considerable time trying to come up with a straightforward solution, largely out of personal curiosity.
That said, I have to admit I am highly skeptical that this additional instruction will ever result in a measurable performance difference in practice, especially on a new Mac. If you have any significant amount of data, you'll be I/O bound, and a single instruction will never be your bottleneck. If you have a tiny amount of data, then you'll need to repeat a great many calculations before a single instruction becomes a bottleneck.
Would you post some code to show that there really is a performance difference? Or describe the code and data you're working with?
Perhaps just use a default label for the first or last case?
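A minimal sketch of the default-label suggestion, assuming the switch operand is already masked to a known range. The function name handle and the return values are hypothetical, chosen only for illustration. Because default stands in for the last case, every possible value has a target and the compiler has no out-of-range path to guard against.

```c
/* Hypothetical handler: (x & 0x3) can only be 0, 1, 2 or 3. */
int handle(unsigned int x)
{
    switch (x & 0x3) {
    case 0:
        return 10;
    case 1:
        return 11;
    case 2:
        return 12;
    default:            /* stands in for "case 3" */
        return 13;
    }
}
```

Whether this actually removes the extra bounds check depends on the compiler version and optimization level, so it is worth checking the generated assembly.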
Perhaps you could use an array of function pointers instead of a switch?
#include <stdio.h>
typedef void (*func)(void);
static void f0(void) { printf("%s\n", __FUNCTION__); }
static void f1(void) { printf("%s\n", __FUNCTION__); }
static void f2(void) { printf("%s\n", __FUNCTION__); }
static void f3(void) { printf("%s\n", __FUNCTION__); }
static void f4(void) { printf("%s\n", __FUNCTION__); }
static void f5(void) { printf("%s\n", __FUNCTION__); }
static void f6(void) { printf("%s\n", __FUNCTION__); }
static void f7(void) { printf("%s\n", __FUNCTION__); }
int main(void)
{
    const func f[8] = { f0, f1, f2, f3, f4, f5, f6, f7 };
    int i;
    for (i = 0; i < 8; ++i)
    {
        f[i]();
    }
    return 0;
}
Have you tried declaring the switch variable as a bitfield?
#include <stdint.h> /* for uint16_t */
struct Container {
    uint16_t a:3;       /* 3 bits: can only hold 0..7 */
    uint16_t unused:13;
};
struct Container cont;
cont.a = 5; /* assign some value */
switch( cont.a ) {
    ...
}
Hope this works!
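For reference, here is a compilable sketch of the bitfield idea. The function name pick and its return values are hypothetical, and it uses unsigned int for the field because that is the bitfield type the C standard guarantees. Assigning to a 3-bit unsigned field keeps only the low three bits, so the switch operand is always in 0..7. Whether the compiler then drops its range check is not guaranteed.

```c
struct Container {
    unsigned int a : 3;     /* can only hold 0..7 */
};

/* Hypothetical dispatcher: returns a distinct value per case. */
int pick(unsigned int value)
{
    struct Container cont;
    cont.a = value;         /* stored modulo 8 */
    switch (cont.a) {
    case 0: return 100;
    case 1: return 101;
    case 2: return 102;
    case 3: return 103;
    case 4: return 104;
    case 5: return 105;
    case 6: return 106;
    case 7: return 107;
    }
    return -1;              /* not reached: all 8 values are covered */
}
```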
I tried compiling something simple and comparable with -O5 and -fno-inline (my f0-f7 functions were trivial) and it generated this:
8048420: 55 push %ebp ;; function preamble
8048421: 89 e5 mov %esp,%ebp ;; Yeah, yeah, it's a function.
8048423: 83 ec 04 sub $0x4,%esp ;; do stuff with the stack
8048426: 8b 45 08 mov 0x8(%ebp),%eax ;; x86 sucks, we get it
8048429: 83 e0 07 and $0x7,%eax ;; Do the (a & 0x7)
804842c: ff 24 85 a0 85 04 08 jmp *0x80485a0(,%eax,4) ;; Jump table!
8048433: 90 nop
8048434: 8d 74 26 00 lea 0x0(%esi,%eiz,1),%esi
8048438: 8d 45 08 lea 0x8(%ebp),%eax
804843b: 89 04 24 mov %eax,(%esp)
804843e: e8 bd ff ff ff call 8048400
8048443: 8b 45 08 mov 0x8(%ebp),%eax
8048446: c9 leave
Did you try playing with optimization levels?
I didn't try, but I'm not sure that gcc_unreachable does the same thing as __builtin_unreachable. Googling the two, gcc_unreachable appears to be designed as an assertion tool for development of GCC itself, perhaps with a branch prediction hint included, whereas __builtin_unreachable makes the program instantly undefined, which sounds like deleting the basic block, which is what you want.
http://gcc.gnu.org/onlinedocs/gcc/Other-Builtins.html#index-g_t_005f_005fbuiltin_005funreachable-3075
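A minimal sketch of how __builtin_unreachable might be used here, assuming GCC 4.5 or later (Clang also provides the builtin). The function name dispatch and the return values are hypothetical. Placing the builtin in the default label asserts that no other value can occur, so the optimizer is free to drop the range check in front of the jump table.

```c
/* Hypothetical dispatcher: the caller promises 0 <= x <= 3. */
int dispatch(int x)
{
    switch (x) {
    case 0:
        return 100;
    case 1:
        return 101;
    case 2:
        return 102;
    case 3:
        return 103;
    default:
        /* Reaching this point is undefined behavior, so the
           compiler may assume it never happens. */
        __builtin_unreachable();
    }
}
```

Note that calling dispatch with a value outside 0..3 is then genuinely undefined, so this is only safe when the range is guaranteed by the surrounding code.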