IEEE 754: How exactly does it work?

Submitted by 末鹿安然 on 2019-12-04 10:46:20

The value to be represented would be 2147483647. The two nearest values that can be represented in single precision are 2147483520 and 2147483648.

As the latter is closer to the unrepresentable "ideal one", it gets used: in floating point, the values get rounded, not truncated.

The standard is available here. You might have to purchase it, as IEEE (and other organizations like it) mainly make their money by selling the standard, to defray their costs in assembling, lobbying for acceptance, and improving the quality of the standard.

The bits only mean what someone designates them to be

"When I use a word," Humpty Dumpty said in rather a scornful tone, "it means just what I choose it to mean -- neither more nor less." "The question is," said Alice, "whether you can make words mean so many different things." "The question is," said Humpty Dumpty, "which is to be master -- that's all." (Through the Looking Glass, Chapter 6)

In this case IEEE has decided what the bits mean, and the printf flag %f prints the corresponding human-readable representation because its implementation interprets the bits according to the same standard.

Occasionally you can manage to cast the bits into another data type (like an int) and print out the "other" representation of those bits. C will catch a lot of the normal number promotions, but you can confuse it, generally by assigning a pointer of the wrong type to the correct address (and dereferencing it).

Note that while you are doing the math by hand, the actual hardware isn't guaranteed to do the math exactly as you would. With integer math there is much more accuracy in the representation, but with floating point math, how you round a number makes a big difference in the output. That's not even mentioning the floating point errors which sometimes were burned into systems (thankfully not often).

Floating point formats are often in a "normalized form" where the most significant bit of the mantissa is always 1. Since it's always 1, you don't need to use up a bit to store it. So when decoding such a number representation, you'll need to add back the 1 at the top.

2147483647 = 2^31 - 1 = +1 * 2^30 * 1.1111 1111 1111 1111 1111 1111 1111 11

When encoding this number in the IEEE 754-1985 single precision format, the significand has to be rounded to fit. With the default rounding mode, round to nearest (ties to even), this means it gets rounded up.

Before rounding:

exponent = 30, significand = 1.1111 1111 1111 1111 1111 1111 1111 11

After rounding the significand to 23 digits after the binary point:

exponent = 30, significand = 10.0000 0000 0000 0000 0000 000

After normalizing:

exponent = 31, significand = 1.0

Encoded in the single precision format:

0 | 10011110 | 00000000000000000000000