Why cast to a pointer then dereference?

心在旅途 2020-12-18 19:06

I was going through this example which has a function outputting a hex bit pattern to represent an arbitrary float.

void ExamineFloat(float fValue)
{
    pr         


        
5 Answers
  • 2020-12-18 19:18
    (unsigned long)fValue
    

    This converts the float value to an unsigned long value: the fractional part is discarded, as described in C11 §6.3.1.4 (this is a value conversion, not a reinterpretation of the bits).

    *(unsigned long *)&fValue
    

    The intention here is to take the address at which fValue is stored, pretend that there is not a float but an unsigned long at this address, and to then read that unsigned long. The purpose is to examine the bit pattern which is used to store the float in memory.

    As shown, this causes undefined behavior though.

    Reason: You may not access an object through a pointer to a type that is not "compatible" with the object's type. "Compatible" pairs are, for example, (unsigned) char and every other type, or structures that share the same initial members (speaking of C here). See §6.5/7 of N1570 for the detailed (C11) list. (Note that my use of "compatible" is different, and broader, than in the referenced text.)

    Solution: Cast to unsigned char *, access the individual bytes of the object and assemble an unsigned long out of them:

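    /* needs <limits.h> for CHAR_BIT and <stddef.h> for size_t */
    unsigned long pattern = 0;
    unsigned char * access = (unsigned char *)&fValue;
    for (size_t i = 0; i < sizeof(float); ++i) {
      pattern <<= CHAR_BIT;  /* shift first, so the last byte stays in the low bits */
      pattern |= *access;
      ++access;
    }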
    

    Note that (as @CodesInChaos pointed out) the above treats the floating point value as being stored with its most significant byte first ("big endian"). If your system uses a different byte order for floating point values you'd need to adjust to that (or rearrange the bytes of above unsigned long, whatever's more practical to you).
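
    The byte-order remark above can be made concrete with a small sketch. The helper names bits_msb_first and bits_lsb_first are hypothetical, and the sketch assumes a 4-byte float; which of the two results matches the conventional hex pattern depends on the platform's byte order.

    ```c
    #include <limits.h>
    #include <stddef.h>
    #include <stdio.h>

    /* Hypothetical helper: the first byte in memory becomes the most
       significant byte of the result (matches big-endian storage). */
    static unsigned long bits_msb_first(float f)
    {
        const unsigned char *p = (const unsigned char *)&f;
        unsigned long pattern = 0;
        for (size_t i = 0; i < sizeof f; ++i) {
            pattern = (pattern << CHAR_BIT) | p[i];
        }
        return pattern;
    }

    /* Hypothetical helper: the first byte in memory becomes the least
       significant byte of the result (matches little-endian storage). */
    static unsigned long bits_lsb_first(float f)
    {
        const unsigned char *p = (const unsigned char *)&f;
        unsigned long pattern = 0;
        for (size_t i = 0; i < sizeof f; ++i) {
            pattern |= (unsigned long)p[i] << (i * CHAR_BIT);
        }
        return pattern;
    }

    int main(void)
    {
        float fValue = 1.0f;
        printf("msb first: %08lx\n", bits_msb_first(fValue));
        printf("lsb first: %08lx\n", bits_lsb_first(fValue));
        return 0;
    }
    ```

    Reading through unsigned char is allowed for any object type, so neither helper runs into the aliasing problem described above.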

  • 2020-12-18 19:29

    Floating-point values have memory representations: for example the bytes can represent a floating-point value using IEEE 754.

    The first expression *(unsigned long *)&fValue will interpret these bytes as if they were the representation of an unsigned long value. By the C standard this is undefined behavior (it violates the so-called "strict aliasing rule"). In practice, issues such as endianness and the sizes of the two types also have to be taken into account.

    The second expression (unsigned long)fValue is C standard compliant. It has a precise meaning:

    C11 (n1570), § 6.3.1.4 Real floating and integer

    When a finite value of real floating type is converted to an integer type other than _Bool, the fractional part is discarded (i.e., the value is truncated toward zero). If the value of the integral part cannot be represented by the integer type, the behavior is undefined.
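
    A minimal sketch of what the quoted rule means in practice. The helper float_to_ulong_checked is a hypothetical name, not a standard function; it only performs the cast when the truncated value is representable, so the undefined case is never reached.

    ```c
    #include <limits.h>
    #include <stdbool.h>
    #include <stdio.h>

    /* Hypothetical helper: converts only when the truncated value fits
       in an unsigned long, and reports failure otherwise. */
    static bool float_to_ulong_checked(float f, unsigned long *out)
    {
        /* Values in (-1, 0) truncate to 0, which is representable.
           (float)ULONG_MAX rounds up to a power of two under the default
           rounding mode, so the strict "<" is the right upper test.
           NaN fails both comparisons and is rejected. */
        if (f > -1.0f && f < (float)ULONG_MAX) {
            *out = (unsigned long)f;
            return true;
        }
        return false;
    }

    int main(void)
    {
        unsigned long v;
        if (float_to_ulong_checked(3.99f, &v)) {
            printf("3.99f -> %lu\n", v);   /* fraction discarded: prints 3 */
        }
        if (!float_to_ulong_checked(-5.0f, &v)) {
            printf("-5.0f is out of range; a plain cast would be undefined\n");
        }
        return 0;
    }
    ```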

  • 2020-12-18 19:38

    Typecasting in C does both a type conversion and a value conversion. The floating point → unsigned long conversion truncates the fractional portion of the floating point number; if the truncated value does not fit in an unsigned long, the behavior is undefined rather than clamped to the range. Converting from one type of pointer to another has no required change in value, so using the pointer typecast is a way to keep the same in-memory representation while changing the type associated with that representation.

    In this case, it's a way to be able to output the binary representation of the floating point value.

  • 2020-12-18 19:39

    As others have already noted, casting a pointer to a non-char type to a pointer to a different non-char type and then dereferencing is undefined behavior.

    That printf("%08lx\n", *(unsigned long *)&fValue) invokes undefined behavior does not necessarily mean that running a program that attempts to perform such a travesty will result in hard drive erasure or make nasal demons erupt from one's nose (the two hallmarks of undefined behavior). On a computer in which sizeof(unsigned long)==sizeof(float) and on which both types have the same alignment requirements, that printf will almost certainly do what one expects it to do, which is to print the hex representation of the floating point value in question.

    This shouldn't be surprising. The C standard openly invites implementations to extend the language. Many of these extensions are in areas that are, strictly speaking, undefined behavior. For example, the POSIX function dlsym returns a void*, but this function is typically used to find the address of a function rather than a global variable. This means the void pointer returned by dlsym needs to be cast to a function pointer and then dereferenced to call the function. This is obviously undefined behavior, but it nonetheless works on any POSIX compliant platform. This will not work on a Harvard architecture machine on which pointers to functions have different sizes than do pointers to data.

    Similarly, casting a pointer to a float to a pointer to an unsigned integer and then dereferencing happens to work on almost any computer with almost any compiler in which the size and alignment requirements of that unsigned integer are the same as that of a float.

    That said, using unsigned long might well get you into trouble. On my computer, an unsigned long is 64 bits long and has 64 bit alignment requirements. This is not compatible with a float. It would be better to use uint32_t -- on my computer, that is.


    The union hack is one way around this mess:

    typedef union {
        float fval;
        uint32_t ival;
    } float_uint32_t;
    

    Assigning to float_uint32_t.fval and then reading float_uint32_t.ival used to be undefined behavior. That is no longer the case in C: since C99 TC3 the stored bytes are reinterpreted as the other member's type. No compiler that I know of blows nasal demons for the union hack. In C++, however, reading a union member other than the one last written is still undefined behavior, although some compilers, GCC among them, document support for it.
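
    For illustration, a small C program built around that union. The function name union_bits is hypothetical, and the sketch assumes float is 32 bits wide (checked at compile time) and uses the IEEE 754 single-precision format.

    ```c
    #include <assert.h>
    #include <inttypes.h>
    #include <stdint.h>
    #include <stdio.h>

    typedef union {
        float fval;
        uint32_t ival;
    } float_uint32_t;

    /* The hack only makes sense if both members have the same size. */
    static_assert(sizeof(float) == sizeof(uint32_t), "float must be 32 bits wide");

    /* Hypothetical helper: write the float member, read the integer member. */
    static uint32_t union_bits(float f)
    {
        float_uint32_t u;
        u.fval = f;
        return u.ival;
    }

    int main(void)
    {
        /* On an IEEE 754 platform this prints 3f800000. */
        printf("%08" PRIx32 "\n", union_bits(1.0f));
        return 0;
    }
    ```

    PRIx32 from <inttypes.h> is used instead of %lx so that the format matches uint32_t regardless of how wide unsigned long is.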


    An even better way around this mess is to use the %a format, which has been part of the C standard since 1999:

    printf ("%a\n", fValue);
    

    This is simple, easy, portable, and there is no chance of undefined behavior. It prints the value in hexadecimal floating-point notation (a hexadecimal significand and a power-of-two exponent), not the raw bytes of the float, and the value printed is a double: printf is a variadic function, so every float argument is promoted to double before the call. This conversion must be exact per the 1999 version of the C standard. One can read that exact value back via a call to scanf or its sisters.
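
    A short sketch of that round trip. The helper round_trip_hex is a hypothetical name; it uses snprintf and strtod (which has accepted hexadecimal floating-point input since C99) rather than scanf, and assumes the usual binary floating-point formats so that %a is exact.

    ```c
    #include <assert.h>
    #include <stdio.h>
    #include <stdlib.h>

    /* Hypothetical helper: format a float with %a, parse the text back. */
    static float round_trip_hex(float f)
    {
        char buf[64];
        snprintf(buf, sizeof buf, "%a", f);   /* f is promoted to double here */
        return (float)strtod(buf, NULL);      /* exact, so the cast loses nothing */
    }

    int main(void)
    {
        float fValue = 0.1f;
        printf("%a\n", fValue);
        assert(round_trip_hex(fValue) == fValue);
        return 0;
    }
    ```

    The exact text produced by %a can differ between C libraries (for example in the choice of leading hex digit), but every conforming form parses back to the same value.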

  • 2020-12-18 19:43

    *(unsigned long *)&fValue is not equivalent to a direct cast to an unsigned long.

    The conversion to (unsigned long)fValue converts the value of fValue into an unsigned long, using the normal rules for conversion of a float value to an unsigned long value. The representation of that value in an unsigned long (for example, in terms of the bits) can be quite different from how that same value is represented in a float.

    The conversion *(unsigned long *)&fValue formally has undefined behaviour. It interprets the memory occupied by fValue as if it is an unsigned long. Practically (i.e. this is what often happens, even though the behaviour is undefined) this will often yield a value quite different from fValue.
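
    To see the difference without relying on undefined behaviour, here is a sketch that prints both results side by side. It swaps in memcpy for the pointer cast (a technique not used in the question), the name bit_pattern is hypothetical, and a 32-bit IEEE 754 float is assumed.

    ```c
    #include <assert.h>
    #include <inttypes.h>
    #include <stdint.h>
    #include <stdio.h>
    #include <string.h>

    static_assert(sizeof(float) == sizeof(uint32_t), "float must be 32 bits wide");

    /* Hypothetical helper: copy the object representation of a float
       into an integer of the same size. */
    static uint32_t bit_pattern(float f)
    {
        uint32_t bits;
        memcpy(&bits, &f, sizeof bits);
        return bits;
    }

    int main(void)
    {
        const float samples[] = { 1.0f, 2.5f };
        for (size_t i = 0; i < sizeof samples / sizeof samples[0]; ++i) {
            float f = samples[i];
            /* Value conversion truncates; the bit pattern is unrelated to it. */
            printf("%g: value %lu, bits 0x%08" PRIx32 "\n",
                   f, (unsigned long)f, bit_pattern(f));
        }
        return 0;
    }
    ```

    With IEEE 754 floats, 1.0f converts to the value 1 while its stored bits are 0x3f800000, which is the "quite different" result described above.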
