It looks to me like the following program computes an invalid pointer, since NULL
is no good for anything but assignment and comparison for equality:
#include <stdlib.h>
#include <stdio.h>
int main() {
    char *c = NULL;
    c--;
    printf("c: %p\n", c);
    return 0;
}
However, it seems like none of the warnings or instrumentations in GCC or Clang targeted at undefined behavior say that this is in fact UB. Is that arithmetic actually valid and I'm being too pedantic, or is this a deficiency in their checking mechanisms that I should report?
Tested:
$ clang-3.3 -Weverything -g -O0 -fsanitize=undefined -fsanitize=null -fsanitize=address offsetnull.c -o offsetnull
$ ./offsetnull
c: 0xffffffffffffffff
$ gcc-4.8 -g -O0 -fsanitize=address offsetnull.c -o offsetnull
$ ./offsetnull
c: 0xffffffffffffffff
It seems to be pretty well documented that AddressSanitizer as used by Clang and GCC is more focused on dereference of bad pointers, so that's fair enough. But the other checks don't catch it either :-/
Edit: part of the reason that I asked this question is that the -fsanitize
flags enable dynamic checks of well-definedness in the generated code. Is this something they should have caught?
Pointer arithmetic on a pointer not pointing to an array is undefined behavior.
Also, dereferencing a NULL pointer is undefined behavior.
char *c = NULL;
c--;
is undefined behavior because c does not point to an array.
C++11 Standard, §5.7/5:
When an expression that has integral type is added to or subtracted from a pointer, the result has the type of the pointer operand. If the pointer operand points to an element of an array object, and the array is large enough, the result points to an element offset from the original element such that the difference of the subscripts of the resulting and original array elements equals the integral expression. In other words, if the expression P points to the i-th element of an array object, the expressions (P)+N (equivalently, N+(P)) and (P)-N (where N has the value n) point to, respectively, the i + n-th and i − n-th elements of the array object, provided they exist. Moreover, if the expression P points to the last element of an array object, the expression (P)+1 points one past the last element of the array object, and if the expression Q points one past the last element of an array object, the expression (Q)-1 points to the last element of the array object. If both the pointer operand and the result point to elements of the same array object, or one past the last element of the array object, the evaluation shall not produce an overflow; otherwise, the behavior is undefined.
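To make the quoted rule concrete, here is a minimal sketch (my own illustration, not part of the original answer) contrasting arithmetic the standard permits with the cases it leaves undefined:

#include <stdio.h>

int main(void) {
    char buf[4] = "abc";
    char *p = buf;

    p = p + 3;        /* OK: points to buf[3], the last element            */
    p = p + 1;        /* OK: one past the last element (must not be read)  */
    /* p = p + 1; */  /* UB: more than one past the end of the array       */

    char *q = NULL;
    /* q--; */        /* UB: q does not point into any array object        */

    printf("p: %p  q: %p\n", (void *)p, (void *)q);
    return 0;
}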
Yes, this is undefined behavior, and is something that -fsanitize=undefined
should have caught; it's already on my TODO list to add a check for this.
FWIW, the C and C++ rules here are slightly different: adding 0
to a null pointer and subtracting one null pointer from another have undefined behavior in C but not in C++. All other arithmetic on null pointers has undefined behavior in both languages.
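A small sketch of the cases described above (my own illustration; which lines are defined depends on whether the file is compiled as C or C++, per the answer's description of the rules):

#include <stddef.h>

int main(void) {
    char *p = NULL;
    char *q = NULL;

    char *r = p + 0;          /* null + 0:    UB in C, well-defined in C++ */
    ptrdiff_t d = q - p;      /* null - null: UB in C, well-defined in C++ */

    /* char *bad = p + 1; */  /* non-zero offset from null: UB in both     */

    (void)r;
    (void)d;
    return 0;
}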
Not only is arithmetic on a null pointer forbidden, but when an implementation traps attempted dereferences of null pointers yet fails to also trap arithmetic on them, much of the benefit of null-pointer traps is lost.
There is never any situation defined by the Standard where adding anything to a null pointer can yield a legitimate pointer value; further, situations in which implementations could define any useful behavior for such actions are rare and would generally be better handled via compiler intrinsics (*). On many implementations, however, if null-pointer arithmetic isn't trapped, adding an offset to a null pointer can yield a pointer which, while not valid, is no longer recognizable as a null pointer. An attempt to dereference such a pointer would not be trapped, but could trigger arbitrary effects.
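To illustrate the hazard (my own sketch; the exact resulting address is implementation-specific, and the program's behavior is of course undefined):

#include <stdio.h>

int main(void) {
    char *null_ptr = NULL;

    /* UB: adding an offset to a null pointer.  On an implementation    */
    /* that neither traps nor diagnoses this, the result is typically   */
    /* a small non-null address such as 0x10.                           */
    char *p = null_ptr + 16;

    /* The usual defensive check no longer helps: p compares unequal    */
    /* to NULL, so a later dereference would bypass any null-pointer    */
    /* trap and touch whatever lives at that address.                   */
    if (p != NULL) {
        printf("p looks usable but is not: %p\n", (void *)p);
    }
    return 0;
}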
Trapping pointer computations of the form (null+offset) and (null-offset) would eliminate this danger. Note that protection would not necessarily require trapping (pointer-null), (null-pointer), or (null-null): while the values returned by the first two expressions would be unlikely to have any usefulness, they would not generate invalid pointers [and if an implementation were to specify that null-null yields zero, code targeting that particular implementation might sometimes be more efficient than code which had to special-case null]. Further, having (null+0) and (null-0) yield null pointers rather than trapping would not jeopardize safety and might avoid the need for user code to special-case null pointers, but the advantages would be less compelling, since the compiler would have to add extra code to make that happen.
(*) Such an intrinsic on an 8086 compiler, for example, might accept two unsigned 16-bit integers "seg" and "ofs" and read the word at address seg:ofs without a null trap, even when that address happened to be zero. Address (0x0000:0x0000) on the 8086 is an interrupt vector which some programs may need to access, and while address (0xFFFF:0x0010) accesses the same physical location as (0x0000:0x0000) on older processors with only 20 address lines, it accesses physical location 0x100000 on processors with 24 or more address lines. In some cases an alternative would be to have a special designation for pointers which are expected to point to things not recognized by the C standard (things like the interrupt vectors would qualify) and refrain from null-trapping those, or else to specify that volatile pointers will be treated in such fashion. I've seen the first behavior in at least one compiler, but don't think I've seen the second.
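As a rough sketch of the kind of interface such an intrinsic might present (the name __peek_word and its details are invented here purely for illustration; no real compiler is claimed to provide it):

#include <stdint.h>

/* Hypothetical intrinsic: read the 16-bit word at real-mode address   */
/* seg:ofs (physical address seg*16 + ofs) without any null-pointer    */
/* trap, even when both arguments are zero, e.g. to read the interrupt */
/* vector stored at 0x0000:0x0000 on an 8086.                          */
uint16_t __peek_word(uint16_t seg, uint16_t ofs);

/* Example use (commented out, since the intrinsic is hypothetical):   */
/* uint16_t int0_offset = __peek_word(0x0000, 0x0000);                 */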
Source: https://stackoverflow.com/questions/15608366/is-performing-arithmetic-on-a-null-pointer-undefined-behavior