Many CPUs have single assembly opcodes for returning the high order bits of a 32 bit integer multiplication. Normally multiplying two 32 bit integers produc
With -O1 optimisation or higher, gcc 4.3.2 translated your function, exactly as you showed it, to this IA32 assembly:
umulhi32:
    pushl %ebp
    movl %esp, %ebp
    movl 12(%ebp), %eax
    mull 8(%ebp)
    movl %edx, %eax
    popl %ebp
    ret
Which is just doing a single 32 bit mull and putting the high 32 bits of the result (from %edx) into the return value.
That's what you wanted, right? Sounds like you just need to turn up the optimisation on your compiler ;) It's possible you could push the compiler in the right direction by eliminating the intermediate variable:
unsigned int umulhi32(unsigned int x, unsigned int y)
{
    return (unsigned int)(((unsigned long long)x * y) >> 32);
}
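One caveat: the version above quietly assumes that unsigned int is exactly 32 bits wide, which is true on IA32 but not something the C standard promises. If that matters to you, a minimal sketch using the fixed-width types from <stdint.h> could look like the following. The name umulhi32_fixed is made up for illustration, and I haven't checked what every compiler generates for it, though I'd expect the same single multiply on targets that have one.

```c
#include <stdint.h>

/* Hypothetical fixed-width variant (name chosen for illustration only).
 * Widens one operand to 64 bits so the multiplication is carried out
 * in 64-bit arithmetic, then keeps the upper 32 bits of the product. */
uint32_t umulhi32_fixed(uint32_t x, uint32_t y)
{
    uint64_t product = (uint64_t)x * y;   /* full 64-bit product */
    return (uint32_t)(product >> 32);     /* high-order 32 bits */
}
```

For example, 0xFFFFFFFF * 0xFFFFFFFF is 0xFFFFFFFE00000001, so umulhi32_fixed(0xFFFFFFFFu, 0xFFFFFFFFu) gives 0xFFFFFFFE, while small products such as 3 * 5 give 0 because nothing spills into the upper half.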