Question
I just read about the IEEE 754 standard in order to understand how single-precision and double-precision floating-point numbers are implemented.
So I wrote this to check my understanding:
#include <stdio.h>
#include <float.h>

int main(void) {
    double foo = 9007199254740992; // 2^53
    double bar = 9007199254740993; // 2^53 + 1, not representable, rounds to 2^53
    printf("%zu\n\n", sizeof(double)); // Outputs 8. Good (%zu is the size_t specifier)
    printf("%f\n\n", foo); // 9007199254740992.000000. Ok
    printf("%f\n", bar); // 9007199254740992.000000. Ok because the mantissa stores 52 bits
    printf("%f\n\n", DBL_MAX); // ??
    return 0;
}
Output:
8
9007199254740992.000000
9007199254740992.000000
179769313486231570814527423731704356798070567525844996598917476803157260780028538760589558632766878171540458953514382464234321326889464182768467546703537516986049910576551282076245490090389328944075868508455133942304583236903222948165808559332123348274797826204144723168738177180919299881250404026184124858368.000000
What I don't understand is that I expected the last line of my output to be (2^53-1) * 2^(1024-52), but the number on the last line is approximately 2^(2^10). What am I missing? How exactly is DBL_MAX calculated?
EDIT: A short explanation of the exact value of DBL_MAX:
As explained in the accepted answer, the largest exponent is 1023, not 1024 as I thought. So the exact value of DBL_MAX is:
(2^53-1)*(2^(1023-52))
(so, as expected, it's slightly smaller than 2^1024 = 2^(2^10), since the mantissa is a bit smaller than 2)
Answer 1:
Doubles are represented as m*2^e, where m is the mantissa and e is the exponent. Doubles have 11 bits for the exponent. Since the exponent can be negative, it is stored with an offset (bias) of 1023. That means that the real calculation is m*2^(e-1023). The largest 11-bit number is 2047. The exponent 2047 is reserved for storing inf and NaN. This means the largest double is m*2^(2046-1023) = m*2^(1023). For normalized numbers the mantissa is a number between 1 (inclusive) and 2 (exclusive), so the largest double is attained when m is almost 2. So we have:
DBL_MAX = max(m)*2^1023 ~ 2*2^1023 = 2^1024 = 2^(2^10)
As you can see here, this is pretty much the standard value of DBL_MAX.
Answer 2:
DBL_MAX is the largest finite value a double can hold. Its magnitude is largely independent of the number of bits in the mantissa.
The limit is mostly determined by the maximum exponent. For IEEE-754, it is about 1.8e+308, just under 2^1024.
The definition is usually #define DBL_MAX 1.7976931348623157e+308
Source: https://stackoverflow.com/questions/30064616/understanding-dbl-max