Which is the first integer that an IEEE 754 float is incapable of representing exactly?

南笙酒味 提交于 2019-11-26 01:21:05

问题


For clarity, if I\'m using a language that implements IEE 754 floats and I declare:

float f0 = 0.f;
float f1 = 1.f;

...and then print them back out, I\'ll get 0.0000 and 1.0000 - exactly.

But IEEE 754 isn\'t capable of representing all the numbers along the real line. Close to zero, the \'gaps\' are small; as you get further away, the gaps get larger.

So, my question is: for an IEEE 754 float, which is the first (closest to zero) integer which cannot be exactly represented? I\'m only really concerned with 32-bit floats for now, although I\'ll be interested to hear the answer for 64-bit if someone gives it!

I thought this would be as simple as calculating 2bits_of_mantissa and adding 1, where bits_of_mantissa is how many bits the standard exposes. I did this for 32-bit floats on my machine (MSVC++, Win64), and it seemed fine, though.


回答1:


2mantissa bits + 1 + 1

The +1 in the exponent (mantissa bits + 1) is because, if the mantissa contains abcdef... the number it represents is actually 1.abcdef... × 2^e, providing an extra implicit bit of precision.

For float, it is 16,777,217 (224 + 1).
For double, it is 9,007,199,254,740,993 (253 + 1).

>>> 9007199254740993.0
9007199254740992



回答2:


The largest value representable by an n bit integer is 2n-1. As noted above, a float has 24 bits of precision in the significand which would seem to imply that 224 wouldn't fit.

However.

Powers of 2 within the range of the exponent are exactly representable as 1.0×2n, so 224can fit and consequently the first unrepresentable integer for float is 224+1. As noted above. Again.



来源:https://stackoverflow.com/questions/3793838/which-is-the-first-integer-that-an-ieee-754-float-is-incapable-of-representing-e

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!