Efficient computation of the high order bits of a 32 bit integer multiplication

青春惊慌失措 2021-01-04 19:24

Many CPUs have single assembly opcodes for returning the high order bits of a 32 bit integer multiplication. Normally multiplying two 32 bit integers produc

3 answers
  •  花落未央
    2021-01-04 19:31

    With -O1 optimisation or higher, gcc 4.3.2 compiles your function, exactly as you wrote it, to this IA-32 assembly:

    umulhi32:
            pushl   %ebp
            movl    %esp, %ebp
            movl    12(%ebp), %eax
            mull    8(%ebp)
            movl    %edx, %eax
            popl    %ebp
            ret
    

    This performs a single 32-bit mull and moves the high 32 bits of the result (from %edx) into the return value in %eax.

    That's what you wanted, right? Sounds like you just need to turn up the optimisation on your compiler ;) It's possible you could push the compiler in the right direction by eliminating the intermediate variable:

    unsigned int umulhi32(unsigned int x, unsigned int y)
    {
      return (unsigned int)(((unsigned long long)x * y)>>32);
    }
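
    One caveat with the version above is that unsigned int is not guaranteed to be exactly 32 bits wide on every platform. A minimal sketch of the same idea with the fixed-width types from <stdint.h> follows; the name umulhi32_fixed is hypothetical and used only for illustration, and whether the compiler reduces it to a single multiply still depends on the target and optimisation level:

```c
#include <stdint.h>

/* Hypothetical fixed-width variant of umulhi32 (name chosen for illustration).
 * Widening x to 64 bits before the multiply makes the full 64-bit product
 * available; shifting right by 32 keeps only its high half. */
static inline uint32_t umulhi32_fixed(uint32_t x, uint32_t y)
{
    uint64_t product = (uint64_t)x * y;  /* y is converted to 64 bits as well */
    return (uint32_t)(product >> 32);    /* discard the low 32 bits */
}
```

    For example, umulhi32_fixed(0xFFFFFFFFu, 0xFFFFFFFFu) gives 0xFFFFFFFE, since the full product is 0xFFFFFFFE00000001.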
    
