Re-implement modulo using bit shifts?

情歌与酒 2021-02-07 16:10

I'm writing some code for a very limited system where the mod operator is very slow. In my code a modulo needs to be used about 180 times per second and I figured that removing

5 answers
  •  春和景丽
    2021-02-07 16:40

    Actually, division by a constant is a well-known target for compiler optimization: the division is replaced by a multiplication with a precomputed constant, and in fact gcc already does this.

    This simple code snippet:

    int mod(int val) {
       return val % 10;
    }
    

    Generates the following code on my rather old gcc with -O3:

    _mod:
            push    ebp
            mov     edx, 1717986919
            mov     ebp, esp
            mov     ecx, DWORD PTR [ebp+8]
            pop     ebp
            mov     eax, ecx
            imul    edx
            mov     eax, ecx
            sar     eax, 31
            sar     edx, 2
            sub     edx, eax
            lea     eax, [edx+edx*4]
            mov     edx, ecx
            add     eax, eax
            sub     edx, eax
            mov     eax, edx
            ret
    

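    If you disregard the function prologue/epilogue, this is basically two multiplications plus some shifts and adds/subs. The first is the imul by the magic number 1717986919, whose high 32 bits end up in edx; the second multiplies the quotient by 10, and on x86 we're lucky and can use lea (plus an add) for that one. I know that I already explained the theory behind this optimization somewhere, so I'll see if I can find that post before explaining it yet again.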
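    To make the listing easier to follow, here is a rough C transcription of it. This is only a sketch: the names div10_magic and mod10_magic are made up for illustration, and it assumes that >> on a negative signed value is an arithmetic shift (implementation-defined in C, but it is what sar does and what gcc provides).

```c
#include <stdint.h>

/* Magic constant from the gcc listing:
   1717986919 == 0x66666667 == ceil(2^34 / 10). */
#define MAGIC_DIV10 1717986919LL

/* Hypothetical helper, not part of any library.
   ASSUMPTION: right-shifting a negative signed value is an arithmetic
   shift (implementation-defined in C; matches the sar instruction). */
int32_t div10_magic(int32_t val)
{
    int64_t prod = (int64_t)val * MAGIC_DIV10; /* imul edx: full 64-bit product */
    int32_t hi = (int32_t)(prod >> 32);        /* high half, i.e. edx */
    int32_t q = hi >> 2;                       /* sar edx, 2 */
    q -= val >> 31;                            /* sub edx, eax: adds 1 if val < 0 */
    return q;                                  /* val / 10, truncated toward zero */
}

/* Remainder with the same sign convention as the C % operator. */
int32_t mod10_magic(int32_t val)
{
    return val - div10_magic(val) * 10;        /* lea/add/sub: val - q * 10 */
}
```

    The shifts by 32 and by 2 together divide the product by 2^34, which is why the constant is 2^34 / 10 rounded up. The subtraction of val >> 31 corrects the rounding for negative inputs so that the result truncates toward zero like the built-in operator.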

    Now on modern CPUs that's certainly faster than accessing memory (even if you hit the cache), but whether it's faster on your evidently more ancient CPU is a question that can only be answered by benchmarking (also make sure your compiler is actually doing that optimization; otherwise you can always just "steal" the gcc version here ;) ). That matters especially because the approach depends on an efficient mulhs (i.e. an instruction that yields the higher bits of a multiplication). Note that this code is not independent of the operand size: to be exact, the magic number changes (and maybe also parts of the adds/shifts), but that can be adapted.
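    As an illustration of how the constants change with the operand size, here is a sketch for unsigned 16-bit values. The constant 52429 and the shift by 19 are my own choice for this width, not something taken from compiler output, and the function names are hypothetical. Since there are only 65536 inputs, the sketch includes an exhaustive check against the built-in operators.

```c
#include <stdint.h>

/* ASSUMPTION (not from compiler output): 52429 == 0xCCCD == ceil(2^19 / 10).
   For 16-bit inputs the product fits in 32 bits, so no multiply-high
   instruction is needed, only a 16x16 -> 32 multiply and a shift. */
uint16_t div10_u16(uint16_t val)
{
    return (uint16_t)(((uint32_t)val * 52429u) >> 19);
}

uint16_t mod10_u16(uint16_t val)
{
    return (uint16_t)(val - div10_u16(val) * 10u);
}

/* Exhaustive self-check against / and %; returns the number of mismatches. */
uint32_t mod10_u16_mismatches(void)
{
    uint32_t bad = 0;
    for (uint32_t v = 0; v <= UINT16_MAX; ++v) {
        if (div10_u16((uint16_t)v) != v / 10 ||
            mod10_u16((uint16_t)v) != v % 10)
            ++bad;
    }
    return bad;
}
```

    Unsigned inputs also drop the sign correction that the 32-bit signed version needs. Whether this beats the plain % operator on a given small CPU is still something only a benchmark can tell.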
