Reading double to platform endianness with union and bit shift, is it safe?

Submitted by 末鹿安然 on 2019-12-05 20:13:39

Reinterpreting Through a Union

Constructing a uint64_t value by shifting and ORing bytes is of course supported by the C standard. (There is some hazard when shifting due to the need to ensure the left operand is the correct size and type to avoid issues with overflow and shift width, but the code in the question correctly converts to uint64_t before shifting.) Then the question remaining for the code is whether reinterpreting through a union is permitted by the C standard. The answer is yes.
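As a minimal sketch of the shift-and-OR step (the question's own code is not reproduced here, so the function name load_be64 and the big-endian input order are assumptions made for illustration):

```c
#include <stdint.h>

/* Hypothetical helper: assemble a 64-bit value from 8 bytes that are
   stored most-significant byte first (big-endian order is assumed). */
uint64_t load_be64(const unsigned char *p)
{
    uint64_t v = 0;
    for (int i = 0; i < 8; i++) {
        /* Convert to uint64_t BEFORE shifting. Without the cast, p[i]
           would be promoted to int, and shifting an int by 32 or more
           bits (on a typical 32-bit int) is undefined behavior. */
        v |= (uint64_t)p[i] << (8 * (7 - i));
    }
    return v;
}
```

Because the result depends only on the order of the bytes in the buffer, this function gives the same value on little-endian and big-endian hosts.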

C 2018 6.5.2.3 3 says:

A postfix expression followed by the . operator and an identifier designates a member of a structure or union object. The value is that of the named member,99)

and note 99 says:

If the member used to read the contents of a union object is not the same as the member last used to store a value in the object, the appropriate part of the object representation of the value is reinterpreted as an object representation in the new type as described in 6.2.6 (a process sometimes called "type punning")…

Such reinterpretation of course relies on the object representations used in the C implementation. Notably the double must use the expected format, matching the bytes read from the input stream.
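The reinterpretation might look like the following sketch. The name bits_to_double is hypothetical, and the code assumes that double is a 64-bit IEEE 754 binary64 type whose bytes are ordered the same way as those of uint64_t; neither assumption is guaranteed by the C standard.

```c
#include <stdint.h>

/* Assumption: double occupies exactly as many bytes as uint64_t. */
_Static_assert(sizeof(double) == sizeof(uint64_t),
               "double is not 64 bits wide on this implementation");

/* Hypothetical helper: store a bit pattern through one union member
   and read it back through the other (type punning, C 2018 note 99). */
double bits_to_double(uint64_t bits)
{
    union {
        uint64_t u;
        double   d;
    } pun;

    pun.u = bits;   /* store as an integer         */
    return pun.d;   /* reinterpret the same bytes  */
}
```

On an implementation that meets those assumptions, the pattern 0x3FF0000000000000 reads back as 1.0.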

Modifying the Bytes of an Object

Modifying an object by modifying its bytes (as by using a pointer to unsigned char) is permitted by C. C 2018 6.5 7 says:

An object shall have its stored value accessed only by an lvalue expression that has one of the following types: [list of various types], or a character type.

Although one of the comments states that you may “access” but not “modify” the bytes of an object this way (apparently interpreting “access” to mean only reading, not writing), C 2018 3.1 defines “access” as:

to read or modify the value of an object.

Thus, one is permitted to read or write the bytes of an object through character types.
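For illustration, here is a sketch that both reads and writes the bytes of a double through an unsigned char pointer by reversing them in place. The function name reverse_double_bytes is hypothetical and does not come from the question.

```c
#include <stddef.h>

/* Hypothetical helper: reverse the bytes of a double in place.
   Every access goes through an lvalue of character type, which
   C 2018 6.5 7 permits for both reading and modifying. */
void reverse_double_bytes(double *d)
{
    unsigned char *p = (unsigned char *)d;

    for (size_t i = 0, j = sizeof *d - 1; i < j; i++, j--) {
        unsigned char tmp = p[i];
        p[i] = p[j];
        p[j] = tmp;
    }
}
```

The byte accesses themselves are well defined, but whether the resulting bytes form a meaningful double value still depends on the implementation's floating-point representation.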

Reading double to platform endianness with union and bit shift, is it safe?

This kind of thing only makes sense when dealing with data from outside the program (e.g. data from a file or network), where you have a strict format for the data (defined in the file format's specification or the network protocol's specification) that may have nothing to do with the format C uses, may have nothing to do with the format the CPU uses, and may not be IEEE 754 format either.

On the other side, C guarantees almost nothing about how floating-point types are represented. For a simple example, it's perfectly legal for the compiler to use a BCD format for float where 0x12345e78 = 1.2345 * 10**78, even if the CPU itself happens to support "IEEE 754".

The result is you have "whatever the spec says format" from outside the program and you're converting that into a different "whatever the compiler felt like format" for use inside the program; and every single assumption you've made (including sizeof(double)) is potentially false.
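One way to make those assumptions explicit is to check them instead of trusting them. The sketch below is only an illustration: the particular set of checks and the function name double_bits_match_integer_order are assumptions, and passing them suggests a binary64-like double without proving full IEEE 754 conformance.

```c
#include <float.h>
#include <limits.h>
#include <stdint.h>

/* Compile-time checks on the representation the conversion relies on. */
_Static_assert(CHAR_BIT == 8, "bytes are not 8 bits wide");
_Static_assert(sizeof(double) == sizeof(uint64_t),
               "double is not 64 bits wide");
_Static_assert(FLT_RADIX == 2 && DBL_MANT_DIG == 53 && DBL_MAX_EXP == 1024,
               "double does not have binary64 precision and range");

/* Hypothetical run-time check: returns 1 if the bytes of a double are
   laid out in the same order as the bytes of a uint64_t, judged by the
   binary64 bit pattern of 1.0; returns 0 otherwise. */
int double_bits_match_integer_order(void)
{
    union {
        double   d;
        uint64_t u;
    } pun;

    pun.d = 1.0;
    return pun.u == UINT64_C(0x3FF0000000000000);
}
```

If any check fails, the portable fallback is to decode the sign, exponent, and significand from the external format arithmetically rather than by reinterpreting bytes.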
