Compute (a*b)%n FAST for 64-bit unsigned arguments in C(++) on x86-64 platforms?

走了就别回头了 2021-01-15 12:35

I'm looking for a fast way to compute (a*b) modulo n (in the mathematical sense) for

5 answers
  •  粉色の甜心
    2021-01-15 12:55

    OK, how about this (not tested):

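    modmul:
    ; Microsoft x64 calling convention:
    ; rcx = a
    ; rdx = b
    ; r8 = n
    mov rax, rdx    ; rax = b (mul takes one operand implicitly from rax)
    mul rcx         ; rdx:rax = a * b, full 128-bit product
    div r8          ; rax = (rdx:rax) / n, rdx = (rdx:rax) % n
    mov rax, rdx    ; return the remainder
    ret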
    

    The precondition is that a * b / n <= ~0ULL (the quotient must fit in 64 bits); otherwise there will be a divide error. That's a slightly less strict condition than a < n && b < n: one of them can be bigger than n as long as the other is small enough.
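
For comparison, here is a minimal C sketch of the same computation, assuming a compiler that provides the non-standard `unsigned __int128` extension (GCC and Clang do on x86-64; MSVC does not). The function names `modmul_u128` and `modmul_div_is_safe` are made up for illustration and are not part of the answer above. The second function spells out the precondition: the quotient fits in 64 bits exactly when the high half of the product is smaller than n.

```c
#include <stdint.h>
#include <stdbool.h>

/* Hypothetical helper: (a * b) % n via a 128-bit intermediate product.
   unsigned __int128 is a GCC/Clang extension, not standard C.
   n must be non-zero. */
static uint64_t modmul_u128(uint64_t a, uint64_t b, uint64_t n)
{
    return (uint64_t)(((unsigned __int128)a * b) % n);
}

/* Hypothetical helper: true when the mul/div sequence would not raise a
   divide error, i.e. when a * b / n fits in 64 bits. That holds exactly
   when the high 64 bits of the product (rdx after mul) are below n. */
static bool modmul_div_is_safe(uint64_t a, uint64_t b, uint64_t n)
{
    uint64_t hi = (uint64_t)(((unsigned __int128)a * b) >> 64);
    return n != 0 && hi < n;
}
```

Unlike the assembly routine, `modmul_u128` has no overflow precondition, but it is not necessarily faster: compilers commonly turn the 128-bit `%` into a runtime library call rather than a single `div`.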

    Unfortunately, it has to be assembled and linked in separately, because MSVC doesn't support inline asm for 64-bit targets.

    It's also still slow. The real problem is the 64-bit div, which can take nearly a hundred cycles (seriously, up to 90 cycles on Nehalem, for example).
