How to generate an SSE4.2 popcnt machine instruction

借酒劲吻你 2021-02-02 16:40

Using the C program:

int main(int argc, char **argv)
{
    return __builtin_popcountll(0xf0f0f0f0f0f0f0f0);
}

and the compiler line (gcc 4.4

3 Answers
  •  有刺的猬
    2021-02-02 17:06

    You need to do it like this:

    #include <stdio.h>
    #include <nmmintrin.h>
    
    int main(void)
    {
        int pop = _mm_popcnt_u64(0xf0f0f0f0f0f0f0f0ULL);
        printf("pop = %d\n", pop);
        return 0;
    }
    
    $ gcc -Wall -m64 -msse4.2 popcnt.c -o popcnt
    $ ./popcnt 
    pop = 32
    $ 
    

    EDIT

    Oops - I just checked the disassembly output with gcc 4.2 and ICC 11.1 - while ICC 11.1 correctly generates popcntl or popcntq, for some reason gcc does not - it calls ___popcountdi2 instead. Weird. I will try a newer version of gcc when I get a chance and see if it's fixed. I guess the only workaround otherwise is to use ICC instead of gcc.
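
For anyone who wants to stay on gcc, here is a minimal sketch of two alternatives to the intrinsic. It assumes a GCC-compatible compiler; the helper names `popcount64_builtin` and `popcount64_portable` are made up for illustration. The first wraps `__builtin_popcountll` from the question, which gcc can lower to a `popcnt` instruction when POPCNT is enabled for the target (for example with `-mpopcnt`) and otherwise typically turns into a library call such as `__popcountdi2`. The second is a plain bit-twiddling fallback that does not depend on any instruction-set flag. I have not checked this against the gcc versions mentioned in this thread.

```c
#include <stdint.h>

/* Hypothetical helper: counts set bits with the GCC builtin.
   Whether this becomes a popcnt instruction or a call into libgcc
   depends on the compiler version and the target flags in use. */
static inline int popcount64_builtin(uint64_t x)
{
    return __builtin_popcountll(x);
}

/* Hypothetical helper: portable fallback using the classic
   parallel (SWAR) bit count, with no special instructions. */
static inline int popcount64_portable(uint64_t x)
{
    /* Sum adjacent bits into 2-bit fields. */
    x = x - ((x >> 1) & 0x5555555555555555ULL);
    /* Sum adjacent 2-bit fields into 4-bit fields. */
    x = (x & 0x3333333333333333ULL) + ((x >> 2) & 0x3333333333333333ULL);
    /* Sum adjacent 4-bit fields into per-byte counts. */
    x = (x + (x >> 4)) & 0x0f0f0f0f0f0f0f0fULL;
    /* Add all byte counts together; the total lands in the top byte. */
    return (int)((x * 0x0101010101010101ULL) >> 56);
}
```

To see which form the compiler actually picked, build with `-S` (or disassemble the binary) and look for `popcnt` versus a call to `__popcountdi2` in the output.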
