Question
I always thought that there are 2^64 different fractional values that can be stored by a variable of type double. (Each bit can have either 1 or 0 as value and so 2^64 different values).
Recently I learned that a NaN (not a number) is represented by an exponent field of 11111111111 together with any non-zero significand. What if, instead, a value were a NaN only when the exponent field is 11111111111 and the significand is 111111......(52 times)?
Won't this allow us to represent 2^52 more different numbers? And 2^52 is a huge number. So are we not wasting the valuable space?
Answer 1:
The IEEE-754 floating-point formats were designed with efficient hardware implementation in mind. All the special input operands can be detected by examining the exponent field only, which is either all-0 (zeros and denormals), or all-1 (infinities and NaNs). So for double precision specifically, only an 11-bit comparator is required, and the check can be performed in a fraction of a processor cycle.
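As a rough software illustration of that exponent-first check, the sketch below extracts the 11-bit exponent field of a double and only looks at the fraction once the exponent has flagged a special case. The helper names `double_bits` and `classify` are made up for this example; real hardware does this with a comparator, not with code like this.

```python
import struct

EXP_MASK = 0x7FF  # the 11 exponent bits of a double, all set


def double_bits(x: float) -> int:
    """Return the 64-bit pattern of a Python float as an unsigned integer."""
    return struct.unpack("<Q", struct.pack("<d", x))[0]


def classify(x: float) -> str:
    """Classify a double by checking the exponent field first."""
    bits = double_bits(x)
    exponent = (bits >> 52) & EXP_MASK
    fraction = bits & ((1 << 52) - 1)
    if exponent == 0:
        # all-0 exponent: zero or denormal
        return "zero" if fraction == 0 else "denormal"
    if exponent == EXP_MASK:
        # all-1 exponent: infinity or NaN
        return "infinity" if fraction == 0 else "nan"
    return "normal"


for value in (0.0, 5e-324, 1.0, float("inf"), float("nan")):
    print(value, classify(value))
```

Every ordinary operand takes the last branch after two comparisons against the exponent field alone.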
Reserving one of 2048 possible exponent encodings for infinities and NaNs is not particularly wasteful. Note that IEEE-754 uses two different kinds of NaNs: signalling NaNs, or SNaNs, trigger an exception when encountered, while quiet NaNs, or QNaNs, are simply propagated through computation until they appear in human-consumable final results. The most significant bit of the mantissa field distinguishes between the two kinds of NaNs: it is cleared for SNaNs and set for QNaNs.
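The quiet/signalling distinction can be sketched on raw 64-bit patterns. The function name `nan_kind` is an assumption of this example, and it works on integers rather than floats because some platforms may quiet a signalling NaN when it is loaded as a floating-point value.

```python
EXP_ALL_ONES = 0x7FF << 52        # exponent field, all 11 bits set
FRACTION_MASK = (1 << 52) - 1     # the 52 mantissa (fraction) bits
QUIET_BIT = 1 << 51               # most significant mantissa bit


def nan_kind(bits: int) -> str:
    """Return 'quiet', 'signaling', or 'not a nan' for a 64-bit pattern."""
    is_special = (bits & EXP_ALL_ONES) == EXP_ALL_ONES
    if not is_special or (bits & FRACTION_MASK) == 0:
        # ordinary number, or an infinity (all-1 exponent, zero fraction)
        return "not a nan"
    return "quiet" if bits & QUIET_BIT else "signaling"


print(nan_kind(0x7FF8000000000000))  # quiet bit set
print(nan_kind(0x7FF0000000000001))  # quiet bit clear, fraction non-zero
print(nan_kind(0x7FF0000000000000))  # +infinity
```

A signalling NaN needs at least one other fraction bit set, since an all-1 exponent with an all-zero fraction encodes an infinity.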
Additionally, IEEE-754 supports, but does not require, the concept of NaN "payload", i.e. multiple NaN encodings with system- or user-defined meanings. For example, "PowerPC Numerics" (Apple 1994), specifies for the Macintosh system that the 8th through 15th most significant bits of the fraction field of a NaN contain a NaN code which indicates the different origins of NaNs, e.g. sqrt() of a negative number other than zero, log() of a negative number, or an invalid argument to an inverse trigonometric function such as asin(). The concept was already used by the SANE (Standard Apple Numerical Environment) introduced with the Apple II, as described in "Apple Numerics Manual, Second Edition" (Apple 1988).
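To make the payload idea concrete, here is a minimal sketch that stores an 8-bit code in a quiet NaN and reads it back. It assumes one reading of the layout described above: counting the top fraction bit as the 1st, the 8th through 15th most significant fraction bits are bits 44 down to 37. The helper names and the code value 21 are hypothetical and are not taken from Apple's documentation.

```python
import math
import struct

QUIET_NAN_BASE = 0x7FF8 << 48  # exponent all ones, quiet bit set
CODE_SHIFT = 37                # assumed position: fraction bits 44..37
CODE_MASK = 0xFF               # 8-bit NaN code


def make_nan_bits(code: int) -> int:
    """Build the bit pattern of a quiet NaN carrying an 8-bit code."""
    if not 0 <= code <= CODE_MASK:
        raise ValueError("code must fit in 8 bits")
    return QUIET_NAN_BASE | (code << CODE_SHIFT)


def nan_code(bits: int) -> int:
    """Extract the 8-bit code from a NaN bit pattern."""
    return (bits >> CODE_SHIFT) & CODE_MASK


def bits_to_double(bits: int) -> float:
    """Reinterpret a 64-bit pattern as a double."""
    return struct.unpack("<d", struct.pack("<Q", bits))[0]


bits = make_nan_bits(21)           # 21 is an arbitrary example code
print(hex(bits), nan_code(bits))
print(math.isnan(bits_to_double(bits)))
```

Whether a payload survives arithmetic on the resulting float depends on the platform, which is why the round trip here is done on the integer bit pattern.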
The C and C++ standards provide a standard function nan() via math.h / cmath that can be used to construct NaN payloads from a string argument in an implementation-defined manner. For a brief description see for example here.
Answer 2:
To put it otherwise, it perhaps wastes 0.048 % in storage, but saves ten times as much in simpler chip design and power efficiency. I think it's a pretty good deal.
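The arithmetic behind that percentage can be checked with a few lines. This is only a counting sketch and the variable names are made up for the example: it counts every bit pattern whose exponent field is all ones, for both signs.

```python
TOTAL_ENCODINGS = 2**64   # all 64-bit patterns
reserved = 2 * 2**52      # sign bit x 52 fraction bits, exponent all ones
infinities = 2            # +inf and -inf (fraction all zero)
nans = reserved - infinities

reserved_share = reserved / TOTAL_ENCODINGS  # equals 1/2048

print(f"NaN encodings: {nans}")
print(f"share of all encodings: {reserved_share:.6%}")
```

One exponent code out of 2048 works out to a little under 0.05 % of all encodings, which matches the rough figure quoted above.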
And what this means in practice is that the largest representable number is about 1.7976931348623157 × 10^308 rather than about 3.5953862697246314 × 10^308, so these "wasted" values wouldn't be that useful anyway.
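The two magnitudes can be reproduced with exact integer arithmetic. The sketch assumes that, in the hypothetical format, the all-1 exponent code would simply act as one more ordinary exponent, so the largest value doubles; `hypothetical_largest` is a name invented for this example.

```python
import sys

# Largest finite double: (2 - 2^-52) * 2^1023 = (2^53 - 1) * 2^971
largest_finite = (2**53 - 1) * 2**971

# Hypothetical maximum if exponent code 2047 were an ordinary exponent
hypothetical_largest = (2**53 - 1) * 2**972

print(float(largest_finite) == sys.float_info.max)
print(len(str(largest_finite)), str(largest_finite)[:17])
print(len(str(hypothetical_largest)), str(hypothetical_largest)[:17])
```

Both values have 309 decimal digits, so the reclaimed encodings would extend the range by only a factor of two.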
Source: https://stackoverflow.com/questions/40785756/is-ieee-754-floating-point-representation-wasting-memory