I\'m looking for fast code for 64-bit (unsigned) cube roots. (I\'m using C and compiling with gcc, but I imagine most of the work required will be language- and compiler-agnost
If pow
is too expensive, you can use a count-leading-zeros instruction to get an approximation to the result, then use a lookup table, then some Newton steps to finish it.
int k = __builtin_clz(n); // counts # of leading zeros (often a single assembly insn)
int b = 64 - k; // # of bits in n
int top8 = n >> (b - 8); // top 8 bits of n (top bit is always 1)
int approx = table[b][top8 & 0x7f];
Given b
and top8
, you can use a lookup table (in my code, 8K entries) to find a good approximation to cuberoot(n)
. Use some Newton steps (see comingstorm's answer) to finish it.