I\'m looking for fast code for 64-bit (unsigned) cube roots. (I\'m using C and compiling with gcc, but I imagine most of the work required will be language- and compiler-agnost
You could try a Newton's step to fix your rounding errors:
ulong r = (ulong)pow(n, 1.0/3);
if(r==0) return r; /* avoid divide by 0 later on */
ulong r3 = r*r*r;
ulong slope = 3*r*r;
ulong r1 = r+1;
ulong r13 = r1*r1*r1;
/* making sure to handle unsigned arithmetic correctly */
if(n >= r13) r+= (n - r3)/slope;
if(n < r3) r-= (r3 - n)/slope;
A single Newton step ought to be enough, but you may have off-by-one (or possibly more?) errors. You can check/fix those using a final check&increment step, as in your OQ:
while(r*r*r > n) --r;
while((r+1)*(r+1)*(r+1) <= n) ++r;
or some such.
(I admit I'm lazy; the right way to do it is to carefully check to determine which (if any) of the check&increment things is actually necessary...)