Can all 32 bit ints be exactly represented as a double? [duplicate]

Submitted by 我怕爱的太早我们不能终老 on 2019-12-11 02:56:07

Question


Possible Duplicate:
Which is the first integer that an IEEE 754 float is incapable of representing exactly?

This is a basic question. My feeling is that the answer is yes (int = 32 bits, double = 53-bit mantissa + sign bit).

Basically, can these asserts fire?

int x = get_random_int();
double dx = x;
int x1 = (int) dx;
assert(x1 == x);
if (x < INT_MAX - 10)
{
    dx += 10;
    int x2 = (int) dx;
    assert(x + 10 == x2);
}

Obviously, complicated expressions involving division and the like won't work ((int)(5.0/3*3) is not the same as 5/3*3), but I wonder whether conversions and addition/subtraction preserve equivalence, provided no overflow occurs.


Answer 1:


If the number of bits in the mantissa is >= the number of bits in the integer, then the answer is yes. In your question you give specific, known sizes for int and the mantissa of double, but it's useful to know that this is not guaranteed by the 2003 C++ standard, which says nothing about the relative sizes of int and double's mantissa.

Note that C and C++ are not required to use IEEE 754 floating-point arithmetic. According to 3.9.1/8 of the 2003 C++ standard,

The value representation of floating-point types is implementation-defined.

In fact, C++ allows floating-point representations that don't even use binary mantissas. In C, including <float.h> (for FLT_RADIX and DBL_MANT_DIG) and <limits.h> (for INT_MAX) lets you infer information about the fundamental types. In particular, if FLT_RADIX raised to the power DBL_MANT_DIG is greater than or equal to INT_MAX, then all int values can be represented exactly. In C++, the relevant quantities are numeric_limits<double>::radix, numeric_limits<double>::digits and numeric_limits<int>::max(), all from <limits>.
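
As a rough sketch of the check described above, the function below compares bit counts instead of computing radix^digits, which could overflow an integer type. The name int_fits_in_double is made up for this illustration, and the function conservatively reports false for any non-binary radix rather than handling that case.

```cpp
#include <cassert>
#include <limits>

// Hypothetical helper (not from the original answer): true if every int
// value can be stored exactly in a double on this implementation.
// Only handles radix 2; any other radix conservatively yields false.
constexpr bool int_fits_in_double() {
    // numeric_limits<int>::digits counts value bits (31 for a 32-bit int).
    // numeric_limits<double>::digits counts significand digits in the
    // given radix (53 for an IEEE 754 binary64 double).
    return std::numeric_limits<double>::radix == 2 &&
           std::numeric_limits<double>::digits >=
               std::numeric_limits<int>::digits;
}
```

Because the function is constexpr, it can also be used in a static_assert so that the build fails on a platform where the assumption does not hold.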

Given two integer-valued operands and an operation whose exact result is always an integer (such as + or *, but not /), every IEEE 754 rounding mode yields that result exactly whenever it is representable in a double. If the result also fits in an int (and is therefore exactly representable in a double, given our assumption that the mantissa is at least as wide as an int), then it will be the same integer you would get from the corresponding integer operation. Any sensible FP implementation will preserve these guarantees, even if it is not IEEE 754 compliant.
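
To make the argument above concrete, here is a small sketch that compares double arithmetic against exact 64-bit integer arithmetic. The helper names sum_is_exact and product_is_exact are invented for this example, and the code assumes a 32-bit int, a 64-bit long long and a 53-bit double significand.

```cpp
#include <cassert>
#include <climits>

// Hypothetical helper: does a + b, computed in double, equal the exact sum?
// Assumes 32-bit int, 64-bit long long, 53-bit double significand.
bool sum_is_exact(int a, int b) {
    long long exact = static_cast<long long>(a) + b;  // cannot overflow in 64 bits
    double sum = static_cast<double>(a) + static_cast<double>(b);
    // |exact| <= 2^32, so converting it to double is itself exact.
    return sum == static_cast<double>(exact);
}

// Hypothetical helper: does a * b, computed in double, equal the exact product?
bool product_is_exact(int a, int b) {
    long long exact = static_cast<long long>(a) * b;  // |a * b| <= 2^62
    double product = static_cast<double>(a) * static_cast<double>(b);
    // The product is integer-valued and below 2^63, so the cast is well defined.
    return static_cast<long long>(product) == exact;
}
```

Under those assumptions, the sum of any two ints needs at most 33 bits, so sum_is_exact should hold for every pair. A product can need up to 62 bits: INT_MAX * INT_MAX, for example, is an odd number larger than 2^53 and cannot be stored in a 53-bit significand, which is why the guarantee is limited to results that fit in an int.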




Answer 2:


Yes. All N-bit ints can be represented in a floating-point format that has at least N-1 mantissa bits (because of the implicit leading 1 bit, which doesn't need to be stored) and an exponent that can store values up to at least N, i.e. one with log2(N)+1 bits.

So you can store an int32_t in a floating-point value with 31 bits of mantissa, six bits of exponent, and one sign bit, which fits in a typical double but not a float. Conversely, a float has only a 24-bit significand (23 stored bits plus the implicit leading 1), so it can represent every integer exactly only up to a magnitude of 2^24, i.e. +/-16,777,216.
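
A short sketch of where float stops being exact, assuming an IEEE 754 binary32 float and a double that holds every int exactly. The helper name float_round_trips is made up for this example. It compares in double rather than casting back to int, because converting an out-of-range float to int would be undefined behaviour.

```cpp
#include <cassert>

// Hypothetical helper: true if v survives a conversion to float unchanged.
// Assumes IEEE 754 binary32 float and a double that represents every int exactly.
bool float_round_trips(int v) {
    float f = static_cast<float>(v);  // may round to a nearby representable value
    // Widening float to double is exact, and so is int to double under the
    // stated assumption, so this comparison detects any rounding.
    return static_cast<double>(f) == static_cast<double>(v);
}
```

With those assumptions, float_round_trips(16777216) holds, while 16777217 (2^24 + 1) does not round-trip: between 2^24 and 2^25 only even integers are representable in a float.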



Source: https://stackoverflow.com/questions/13269523/can-all-32-bit-ints-be-exactly-represented-as-a-double
