Is IEEE 754 floating point representation wasting memory?

Submitted by 爷,独闯天下 on 2020-01-05 04:35:29

Question


I always thought that a variable of type double can store 2^64 different fractional values (each of the 64 bits can be either 0 or 1, which gives 2^64 different bit patterns).

Recently I learned that NaN (not a number) is represented by any bit pattern whose exponent part is 11111111111 and whose significand part is non-zero. What if, instead, a value were NaN only when the exponent part is 11111111111 and the significand part is 111111...... (52 ones)?

Wouldn't that allow us to represent 2^52 more different numbers? 2^52 is a huge number, so aren't we wasting valuable space?


Answer 1:


The IEEE-754 floating-point formats were designed with efficient hardware implementation in mind. All the special input operands can be detected by examining the exponent field only, which is either all-0 (zeros and denormals) or all-1 (infinities and NaNs). So for double precision specifically, only an 11-bit comparator is required, and the check can be performed in a fraction of a processor cycle.
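The exponent-only check can be imitated in software. The following is a minimal Python sketch, not how any particular FPU is wired; the helper names `exponent_field` and `is_special_operand` are made up for this illustration.

```python
import struct

EXP_ALL_ONES = 0x7FF  # the 11-bit exponent field of a double, all bits set

def exponent_field(x: float) -> int:
    """Return the raw (biased) 11-bit exponent field of a double."""
    bits = struct.unpack("<Q", struct.pack("<d", x))[0]
    return (bits >> 52) & EXP_ALL_ONES

def is_special_operand(x: float) -> bool:
    """True for zeros, denormals, infinities and NaNs, judged by the exponent alone."""
    e = exponent_field(x)
    return e == 0 or e == EXP_ALL_ONES

# 5e-324 is the smallest positive denormal double.
for value in (0.0, 5e-324, 1.0, float("inf"), float("nan")):
    print(value, exponent_field(value), is_special_operand(value))
```

Only the 11 exponent bits are inspected; the significand is never looked at to decide whether an operand needs special handling.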

Reserving one of 2048 possible exponent encodings for infinities and NaNs is not particularly wasteful. Note that IEEE-754 uses two different kinds of NaNs: signalling NaNs, or SNaNs, trigger an exception when encountered, while quiet NaNs, or QNaNs, are simply propagated through computation until they appear in human-consumable final results. The most significant bit of the mantissa field distinguishes between the two kinds of NaNs: it is cleared for SNaNs and set for QNaNs.
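As a rough illustration of that bit layout, the sketch below classifies raw 64-bit patterns. It works on integers rather than on float values, because moving a signalling NaN through floating-point registers may quiet it on some platforms; `classify_bits` is a hypothetical helper name.

```python
EXP_ALL_ONES = 0x7FF
FRACTION_MASK = (1 << 52) - 1
QUIET_BIT = 1 << 51  # most significant bit of the 52-bit mantissa field

def classify_bits(bits: int) -> str:
    """Classify a raw 64-bit double encoding using the convention described above."""
    exponent = (bits >> 52) & EXP_ALL_ONES
    fraction = bits & FRACTION_MASK
    if exponent != EXP_ALL_ONES:
        return "not an infinity or NaN"
    if fraction == 0:
        return "infinity"
    return "quiet NaN" if fraction & QUIET_BIT else "signalling NaN"

print(classify_bits(0x7FF0000000000000))  # exponent all ones, mantissa zero
print(classify_bits(0x7FF8000000000000))  # quiet bit set
print(classify_bits(0x7FF0000000000001))  # quiet bit clear, mantissa non-zero
```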

Additionally, IEEE-754 supports, but does not require, the concept of a NaN "payload", i.e. multiple NaN encodings with system- or user-defined meanings. For example, "PowerPC Numerics" (Apple 1994) specifies for the Macintosh system that the 8th through 15th most significant bits of the fraction field of a NaN contain a NaN code which indicates the origin of the NaN, e.g. sqrt() of a negative number other than zero, log() of a negative number, or an invalid argument to an inverse trigonometric function such as asin(). The concept was already used by SANE (Standard Apple Numerical Environment), introduced with the Apple II, as described in "Apple Numerics Manual, Second Edition" (Apple 1988).

The C and C++ standards provide a standard function nan() via math.h / cmath that can be used to construct NaN payloads from a string argument in an implementation-defined manner.
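To make the payload idea concrete, here is a small Python sketch that stores an integer in the low mantissa bits of a quiet NaN and reads it back. It does not reproduce C's nan() or Apple's NaN-code layout: the helper names `make_nan` and `nan_payload` and the choice of the low 51 bits are assumptions for illustration, and whether a payload survives a round trip through floating-point registers depends on the platform.

```python
import math
import struct

QUIET_NAN_BITS = 0x7FF8000000000000  # exponent all ones, quiet bit set
PAYLOAD_MASK = (1 << 51) - 1         # remaining 51 mantissa bits (illustrative choice)

def make_nan(payload: int) -> float:
    """Build a quiet NaN whose low mantissa bits carry `payload`."""
    if not 0 <= payload <= PAYLOAD_MASK:
        raise ValueError("payload does not fit in 51 bits")
    bits = QUIET_NAN_BITS | payload
    return struct.unpack("<d", struct.pack("<Q", bits))[0]

def nan_payload(x: float) -> int:
    """Extract the low 51 mantissa bits of a double."""
    bits = struct.unpack("<Q", struct.pack("<d", x))[0]
    return bits & PAYLOAD_MASK

tagged = make_nan(42)
print(math.isnan(tagged), nan_payload(tagged))
```

Every payload value still yields a NaN, which is why the many "spare" encodings can carry diagnostic information instead of representing numbers.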




Answer 2:


To put it otherwise, it perhaps wastes 0.048 % in storage, but saves ten times as much in simpler chip design and power efficiency. I think it's a pretty good deal.
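The storage figure can be checked with a few lines of integer arithmetic. This is only a back-of-the-envelope sketch: it counts encodings for a 64-bit double and assumes nothing beyond the field widths already mentioned (1 sign bit, 11 exponent bits, 52 significand bits).

```python
TOTAL_ENCODINGS = 2**64

# Encodings whose exponent field is all ones, for both signs.
reserved_encodings = 2 * 2**52
infinity_encodings = 2                       # +inf and -inf
nan_encodings = reserved_encodings - infinity_encodings

reserved_share = reserved_encodings / TOTAL_ENCODINGS  # equals 1/2048
nan_share = nan_encodings / TOTAL_ENCODINGS

print(f"reserved exponent share: {reserved_share:.6%}")
print(f"NaN share:               {nan_share:.6%}")
```

One exponent value out of 2048 is 1/2048, slightly under 0.049 % of all bit patterns, which is in line with the figure quoted above.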

And what this means in practice is that the largest representable number is ~

179769313486231570000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000

instead of

359538626972463140000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000

so these "wasted" values wouldn't be that useful anyway.
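The two magnitudes can be reproduced in Python. The sketch uses exact integer arithmetic for the doubled value, since it would overflow a real double. Treating "twice the current maximum" as what an extra usable exponent value would buy is an approximation made for this illustration.

```python
import sys

largest = sys.float_info.max           # largest finite double
exact_max = (2**53 - 1) * 2**971       # the same value as an exact integer
hypothetical_max = 2 * exact_max       # one more exponent step, integers only

print(int(largest) == exact_max)
print(len(str(exact_max)), str(exact_max)[:17])
print(len(str(hypothetical_max)), str(hypothetical_max)[:17])
```

Both values have the same number of decimal digits, so freeing the reserved exponent would only double the range.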



Source: https://stackoverflow.com/questions/40785756/is-ieee-754-floating-point-representation-wasting-memory
