Take the address of a one-past-the-end array element via subscript: legal by the C++ Standard or not?

后端 未结 13 1377
傲寒
傲寒 2020-11-22 06:34

I have seen it asserted several times now that the following code is not allowed by the C++ Standard:

int array[5];
int *array_begin = &array[0];
int *ar         


        
13条回答
  •  悲&欢浪女
    2020-11-22 06:53

    This is legal:

    int array[5];
    int *array_begin = &array[0];
    int *array_end = &array[5];
    

    Section 5.2.1 Subscripting The expression E1[E2] is identical (by definition) to *((E1)+(E2))

    So by this we can say that array_end is equivalent too:

    int *array_end = &(*((array) + 5)); // or &(*(array + 5))
    

    Section 5.3.1.1 Unary operator '*': The unary * operator performs indirection: the expression to which it is applied shall be a pointer to an object type, or a pointer to a function type and the result is an lvalue referring to the object or function to which the expression points. If the type of the expression is “pointer to T,” the type of the result is “T.” [ Note: a pointer to an incomplete type (other than cv void) can be dereferenced. The lvalue thus obtained can be used in limited ways (to initialize a reference, for example); this lvalue must not be converted to an rvalue, see 4.1. — end note ]

    The important part of the above:

    'the result is an lvalue referring to the object or function'.

    The unary operator '*' is returning a lvalue referring to the int (no de-refeference). The unary operator '&' then gets the address of the lvalue.

    As long as there is no de-referencing of an out of bounds pointer then the operation is fully covered by the standard and all behavior is defined. So by my reading the above is completely legal.

    The fact that a lot of the STL algorithms depend on the behavior being well defined, is a sort of hint that the standards committee has already though of this and I am sure there is a something that covers this explicitly.

    The comment section below presents two arguments:

    (please read: but it is long and both of us end up trollish)

    Argument 1

    this is illegal because of section 5.7 paragraph 5

    When an expression that has integral type is added to or subtracted from a pointer, the result has the type of the pointer operand. If the pointer operand points to an element of an array object, and the array is large enough, the result points to an element offset from the original element such that the difference of the subscripts of the resulting and original array elements equals the integral expression. In other words, if the expression P points to the i-th element of an array object, the expressions (P)+N (equivalently, N+(P)) and (P)-N (where N has the value n) point to, respectively, the i + n-th and i − n-th elements of the array object, provided they exist. Moreover, if the expression P points to the last element of an array object, the expression (P)+1 points one past the last element of the array object, and if the expression Q points one past the last element of an array object, the expression (Q)-1 points to the last element of the array object. If both the pointer operand and the result point to elements of the same array object, or one past the last element of the array object, the evaluation shall not produce an overflow; otherwise, the behavior is undefined.

    And though the section is relevant; it does not show undefined behavior. All the elements in the array we are talking about are either within the array or one past the end (which is well defined by the above paragraph).

    Argument 2:

    The second argument presented below is: * is the de-reference operator.
    And though this is a common term used to describe the '*' operator; this term is deliberately avoided in the standard as the term 'de-reference' is not well defined in terms of the language and what that means to the underlying hardware.

    Though accessing the memory one beyond the end of the array is definitely undefined behavior. I am not convinced the unary * operator accesses the memory (reads/writes to memory) in this context (not in a way the standard defines). In this context (as defined by the standard (see 5.3.1.1)) the unary * operator returns a lvalue referring to the object. In my understanding of the language this is not access to the underlying memory. The result of this expression is then immediately used by the unary & operator operator that returns the address of the object referred to by the lvalue referring to the object.

    Many other references to Wikipedia and non canonical sources are presented. All of which I find irrelevant. C++ is defined by the standard.

    Conclusion:

    I am wiling to concede there are many parts of the standard that I may have not considered and may prove my above arguments wrong. NON are provided below. If you show me a standard reference that shows this is UB. I will

    1. Leave the answer.
    2. Put in all caps this is stupid and I am wrong for all to read.

    This is not an argument:

    Not everything in the entire world is defined by the C++ standard. Open your mind.

提交回复
热议问题