I\'m calculating fixedpoint reciprocals in Q22.10 with Goldschmidt division for use in my software rasterizer on ARM.
This is done by just setting the numerator to 1, i.
Mads, you are not losing any precision at all. When you divide 512.00002f by 2^10, you merely decrease the exponent of your floating point number by 10. Mantissa remains the same. Of course unless the exponent hits its minimum value but that shouldn't happen since you're scaling to (0.5, 1].
EDIT: Ok so you're using a fixed decimal point. In that case you should allow a different representation of the denominator in your algorithm. The value of D is from (0.5, 1] not only at the beginning but throughout the whole calculation (it's easy to prove that x * (2-x) < 1 for x < 1). So you should represent the denominator with decimal point at base = 32. This way you will have 32 bits of precision all the time.
EDIT: To implement this you'll have to change the following lines of your code:
//bitpos = 31 - clz(val) - BASE;
bitpos = 31 - clz(val) - 31;
...
//F = (2ULL< >BASE;
F = -D;
N = F >> (31 - BASE);
D = ((unsigned long long)D*F)>>31;
...
//F = (2<<(BASE)) - D;
//D = ((unsigned long long)D*F)>>BASE;
F = -D;
D = ((unsigned long long)D*F)>>31;
...
//N = ((unsigned long long)N*F)>>BASE;
N = ((unsigned long long)N*F)>>31;
Also in the end you'll have to shift N not by bitpos but some different value which I'm too lazy to figure out right now :).