According to N1570 (C11 draft), 6.5.6/8 (Additive operators):

Moreover, if the expression P points to the last element of an array object, the expression (P)+1 points one past the last element of the array object, and if the expression Q points one past the last element of an array object, the expression (Q)-1 points to the last element of the array object.
Subclause 6.5.6/9 also contains:

Moreover, if the expression P points either to an element of an array object or one past the last element of an array object, and the expression Q points to the last element of the same array object, the expression ((Q)+1)-(P) has the same value as ((Q)-(P))+1 and as -((P)-((Q)+1)), and has the value zero if the expression P points one past the last element of the array object, even though the expression (Q)+1 does not point to an element of the array object.
This makes pointer arithmetic like the following valid:
#include <stdio.h>

int main(void)
{
    int a[3] = {0, 1, 2};
    int *P, *Q;

    P = a + 3; // one past the last element
    Q = a + 2; // last element

    printf("%td\n", ((Q)+1)-(P));
    printf("%td\n", ((Q)-(P))+1);
    printf("%td\n", -((P)-((Q)+1)));

    return 0;
}
I would have expected the standard to disallow pointers that point outside the bounds of an array, since dereferencing such a pointer is undefined behaviour (an array overrun) and therefore potentially dangerous. Is there a rationale for allowing this?
Specifying the range to loop over as the half-closed interval [start, end), especially for array indices, has certain pleasing properties, as Dijkstra observed in one of his notes.
1) You can compute the size of the range as a simple function, end - start. In particular, if the range is specified in terms of array indices, the number of iterations performed by the loop is end - start. If the range were [start, end], the number of iterations would be end - start + 1, which is very annoying, isn't it? :)
2) Dijkstra's second observation applies only to (non-negative) integral indices. The ranges [start, end) and (start, end] both have the property mentioned in 1). However, (start, end] requires you to allow a start index of -1 to represent a loop range that includes index 0: you are allowing an "unnatural" value of -1 just for the sake of representing the range. The [start, end) convention does not have this issue, because end is a non-negative integer, and hence a natural choice when dealing with array indices.
Dijkstra's objection to allowing -1 does have similarities to allowing one past the last valid address of the container. However, since the [start, end) convention has been in use for so long, it likely persuaded the standards committee to make this exception.
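As a minimal sketch of the half-open convention applied to pointers (the function names sum_range and range_length are made up for illustration and do not come from the standard), the one-past-the-end pointer serves only as a loop bound and as an operand of subtraction. It is never dereferenced:

```c
#include <stddef.h>

/* Sum the elements in the half-open range [begin, end).
   `end` may be the one-past-the-last pointer: it is compared against,
   but never dereferenced. */
int sum_range(const int *begin, const int *end)
{
    int total = 0;
    for (const int *p = begin; p != end; ++p)
        total += *p;
    return total;
}

/* Number of elements in [begin, end): a plain subtraction, no "+ 1". */
ptrdiff_t range_length(const int *begin, const int *end)
{
    return end - begin;
}
```

For int a[3] = {0, 1, 2}, calling sum_range(a, a + 3) visits all three elements, range_length(a, a + 3) is 3, and the empty range [a + 3, a + 3) has length 0 without needing any special case.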
The rationale is quite simple. The compiler is not allowed to place an array at the end of memory. To illustrate, assume that we have a 16-bit machine with 16-bit pointers. The low address is 0x0000. The high address is 0xffff. If you declare char array[256] and the compiler locates array at address 0xff00, then technically the array would fit into memory, using addresses 0xff00 through 0xffff inclusive. However, the expression
char *endptr = &array[256]; // endptr points one past the end of the array
would be equivalent to
char *endptr = NULL; // &array[256] = 0xff00 + 0x0100 = 0x0000
This means that the following loop would not work, since ptr will never be less than 0:
for ( char *ptr = array; ptr < endptr; ptr++ )
So the sections you cited are simply lawyer-speak for, "Don't put arrays at the end of a memory region".
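The wraparound can be simulated on any machine by modelling a 16-bit address as a uint16_t. This is only a sketch under that assumption: the helpers one_past_end and loop_iterations are hypothetical, and real pointers are not integers as far as the C standard is concerned.

```c
#include <stdint.h>

/* Hypothetical model: a 16-bit address is a uint16_t, so the result
   is reduced modulo 0x10000 when converted back to 16 bits. */
uint16_t one_past_end(uint16_t base, uint16_t size)
{
    return (uint16_t)(base + size);
}

/* Count the iterations of `for (ptr = base; ptr < end; ptr++)`
   in this 16-bit model. */
unsigned loop_iterations(uint16_t base, uint16_t size)
{
    uint16_t end = one_past_end(base, size);
    unsigned count = 0;
    for (uint16_t ptr = base; ptr < end; ptr++)
        count++;
    return count;
}
```

In this model, a 256-byte array at 0xff00 has a one-past-the-end address of 0x0000, so the loop body runs zero times. Moving the array down to 0xfe00 gives an end address of 0xff00 and the expected 256 iterations.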
Historical note: the earliest x86 processors used a segmented memory scheme wherein memory addresses were specified by a 16-bit pointer register and a 16-bit segment register. The final address was computed by shifting the segment register left by 4 bits and adding it to the pointer, e.g.
pointer register      1234
segment register     AB00
                     -----
address in memory    AC234
The resulting address space was 1 MB, but there were end-of-memory boundaries every 64 KB, because the 16-bit pointer wraps around within its segment. That's one reason for using lawyer-speak instead of stating, "Don't put arrays at the end of memory" in plain English.
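The address computation described above can be sketched as follows. The function name physical_address is made up for illustration, and the sketch is simplified: it does not mask the result to 20 bits as the original hardware would.

```c
#include <stdint.h>

/* Segmented address as described above: shift the 16-bit segment
   left by 4 bits and add the 16-bit pointer (offset).
   Simplification: the result is not truncated to 20 bits. */
uint32_t physical_address(uint16_t segment, uint16_t offset)
{
    return ((uint32_t)segment << 4) + offset;
}
```

With segment 0xAB00 and pointer 0x1234 this yields 0xAC234, matching the worked example. Incrementing a pointer of 0xFFFF wraps it to 0x0000, so the computed address jumps from 0xBAFFF back to 0xAB000. That jump is the 64 KB boundary an array must not straddle.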
Source: https://stackoverflow.com/questions/27472531/what-is-the-rationale-for-one-past-the-last-element-of-an-array-object