Potential problem with C standard malloc'ing chars

感情败类 2020-12-31 20:49

When answering a comment to another answer of mine here, I found what I think may be a hole in the C standard (c1x, I haven't checked the earlier ones and yes, I k

3 Answers
  • 2020-12-31 20:54

    Aren't the units of "size_t sz" in whatever the addressable unit of your architecture is? I work with a DSP whose addresses correspond to 32-bit values, not bytes. malloc(1) gets me a pointer to a 4-byte area.

  • 2020-12-31 21:04

    In a 16-bit char environment malloc(10 * sizeof(char)) will allocate 10 chars (10 bytes), because if char is 16 bits, then that architecture/implementation defines a byte as 16 bits. A char isn't an octet, it's a byte. On older computers a byte can be larger than the 8-bit de facto standard we have today.

    The relevant section from the C standard follows:

    3.6 Terms, definitions and symbols

    byte - addressable unit of data storage large enough to hold any member of the basic character set of the execution environment...

    NOTE 2 - A byte is composed of a contiguous sequence of bits, the number of which is implementation-defined.
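The point can be sketched in C as follows. This is only an illustration, not text from the standard: the helper names alloc_chars and bits_in_chars are hypothetical, and the only facts relied on are that sizeof(char) is 1 and that CHAR_BIT from <limits.h> gives the number of bits in a byte.

```c
#include <limits.h>  /* CHAR_BIT */
#include <stdlib.h>  /* malloc, size_t */

/* Hypothetical helper: allocate storage for n chars.
 * sizeof(char) is 1 by definition, so this requests exactly n bytes,
 * whatever the width of a byte is on the implementation.
 * Returns NULL if the allocation fails. */
char *alloc_chars(size_t n)
{
    return malloc(n * sizeof(char));
}

/* Hypothetical helper: total number of bits in a buffer of n chars. */
size_t bits_in_chars(size_t n)
{
    return n * CHAR_BIT;
}
```

With 8-bit bytes bits_in_chars(10) is 80; with 16-bit chars it would be 160, yet alloc_chars(10) holds 10 chars in both cases.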

  • 2020-12-31 21:09

    In the C99 standard the rigorous correlation between bytes, char, and object size is given in 6.2.6.1/4 "Representations of types - General":

    Values stored in non-bit-field objects of any other object type consist of n × CHAR_BIT bits, where n is the size of an object of that type, in bytes. The value may be copied into an object of type unsigned char [n] (e.g., by memcpy); the resulting set of bytes is called the object representation of the value.
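A minimal sketch of what that paragraph describes, assuming a long as the example object (the function names roundtrip_long and object_bits_long are made up for illustration): the value is copied into an unsigned char [n] array with memcpy, where n is sizeof the object, and then copied back.

```c
#include <limits.h>  /* CHAR_BIT */
#include <stddef.h>  /* size_t */
#include <string.h>  /* memcpy */

/* Hypothetical helper: copy the object representation of a long into an
 * array of unsigned char, copy it back into another long, and report
 * whether the value survived the round trip (1 = yes, 0 = no). */
int roundtrip_long(long value)
{
    unsigned char bytes[sizeof value];  /* n = sizeof(long) bytes */
    long copy;

    memcpy(bytes, &value, sizeof value);  /* object -> unsigned char[n] */
    memcpy(&copy, bytes, sizeof copy);    /* unsigned char[n] -> object */
    return copy == value;
}

/* Hypothetical helper: n x CHAR_BIT, the number of bits a long occupies. */
size_t object_bits_long(void)
{
    return sizeof(long) * CHAR_BIT;
}
```

The array size is expressed with sizeof, which counts bytes (chars), so the code does not depend on a byte being 8 bits wide.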

    In the C++ standard the same relationship is given in 3.9/2 "Types":

    For any object (other than a base-class subobject) of POD type T, whether or not the object holds a valid value of type T, the underlying bytes (1.7) making up the object can be copied into an array of char or unsigned char. If the content of the array of char or unsigned char is copied back into the object, the object shall subsequently hold its original value.

    In C90 the correlation doesn't appear to be stated as explicitly, but from the definition of a byte, the definition of a character, and the definition of the sizeof operator, the inference can be made that a char type is equivalent to a byte.

    Also note that the number of bits in a byte (and the number of bits in a char) is implementation-defined; strictly speaking it doesn't need to be 8 bits. And onebyone points out in a comment elsewhere that DSPs commonly have bytes with a number of bits that isn't 8.

    Note that IETF RFCs and standards generally (always?) use the term 'octet' instead of 'byte' to be unambiguous that the units they're talking about have exactly 8 bits - no more, no less.
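When code has to hold a given number of octets (for example a protocol field whose size is specified in octets), the distinction matters for the size passed to malloc. A rough sketch, assuming a hypothetical helper named chars_for_octets and ignoring overflow for very large counts:

```c
#include <limits.h>  /* CHAR_BIT */
#include <stddef.h>  /* size_t */

/* Hypothetical helper: smallest number of chars (bytes) whose combined
 * width is at least `octets` groups of 8 bits. Rounds up when CHAR_BIT
 * does not divide the bit count evenly. Overflow of octets * 8 is not
 * handled in this sketch. */
size_t chars_for_octets(size_t octets)
{
    size_t bits = octets * 8u;
    return (bits + CHAR_BIT - 1) / CHAR_BIT;
}
```

Where CHAR_BIT is 8 this simply returns the octet count; where a byte is 16 bits wide, 10 octets would fit in 5 chars.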
