A colleague of mine recently got bitten badly by writing out of bounds to a static array on the stack (he added an element to it without increasing the array size). Shouldn
GCC does warn about this. But you need to do two things:
a
is, and that you ran off the edge..
$ cat foo.c
int main(void)
{
int a[10];
a[13] = 3; // oops, overwrote the return address
return a[1];
}
$ gcc -Wall -Wextra -O2 -c foo.c
foo.c: In function ‘main’:
foo.c:4: warning: array subscript is above array bounds
BTW: If you returned a[13] in your test program, that wouldn't work either, as GCC optimizes out the array again.
-fbounds-checking option is available with gcc.
worth going thru this article http://www.doc.ic.ac.uk/~phjk/BoundsChecking.html
'le dorfier' has given apt answer to your question though, its your program and it is the way C behaves.
You're right, the behavior is undefined. C99 pointers must point within or just one element beyond declared or heap-allocated data structures.
I've never been able to figure out how the gcc
people decide when to warn. I was shocked to learn that -Wall
by itself will not warn of uninitialized variables; at minimum you need -O
, and even then the warning is sometimes omitted.
I conjecture that because unbounded arrays are so common in C, the compiler probably doesn't have a way in its expression trees to represent an array that has a size known at compile time. So although the information is present at the declaration, I conjecture that at the use it is already lost.
I second the recommendation of valgrind. If you are programming in C, you should run valgrind on every program, all the time until you can no longer take the performance hit.
Have you tried -fmudflap
with GCC? These are runtime checks but are useful, as most often you have got to do with runtime calculated indices anyway. Instead of silently continue to work, it will notify you about those bugs.
-fmudflap -fmudflapth -fmudflapir
For front-ends that support it (C and C++), instrument all risky pointer/array dereferencing operations, some standard library string/heap functions, and some other associated constructs with range/validity tests. Modules so instrumented should be immune to buffer overflows, invalid heap use, and some other classes of C/C++ programming errors. The instrumen‐ tation relies on a separate runtime library (libmudflap), which will be linked into a program if -fmudflap is given at link time. Run-time behavior of the instrumented program is controlled by the MUDFLAP_OPTIONS environment variable. See "env MUDFLAP_OPTIONS=-help a.out" for its options.Use -fmudflapth instead of -fmudflap to compile and to link if your program is multi-threaded. Use -fmudflapir, in addition to -fmudflap or -fmudflapth, if instrumentation should ignore pointer reads. This produces less instrumentation (and there‐ fore faster execution) and still provides some protection against outright memory corrupting writes, but allows erroneously read data to propagate within a program.
Here is what mudflap gives me for your example:
[js@HOST2 cpp]$ gcc -fstack-protector-all -fmudflap -lmudflap mudf.c
[js@HOST2 cpp]$ ./a.out
*******
mudflap violation 1 (check/write): time=1229801723.191441 ptr=0xbfdd9c04 size=56
pc=0xb7fb126d location=`mudf.c:4:3 (main)'
/usr/lib/libmudflap.so.0(__mf_check+0x3d) [0xb7fb126d]
./a.out(main+0xb9) [0x804887d]
/usr/lib/libmudflap.so.0(__wrap_main+0x4f) [0xb7fb0a5f]
Nearby object 1: checked region begins 0B into and ends 16B after
mudflap object 0x8509cd8: name=`mudf.c:3:7 (main) a'
bounds=[0xbfdd9c04,0xbfdd9c2b] size=40 area=stack check=0r/3w liveness=3
alloc time=1229801723.191433 pc=0xb7fb09fd
number of nearby objects: 1
[js@HOST2 cpp]$
It has a bunch of options. For example it can fork off a gdb process upon violations, can show you where your program leaked (using -print-leaks
) or detect uninitialized variable reads. Use MUDFLAP_OPTIONS=-help ./a.out
to get a list of options. Since mudflap only outputs addresses and not filenames and lines of the source, i wrote a little gawk script:
/^ / {
file = gensub(/([^(]*).*/, "\\1", 1);
addr = gensub(/.*\[([x[:xdigit:]]*)\]$/, "\\1", 1);
if(file && addr) {
cmd = "addr2line -e " file " " addr
cmd | getline laddr
print $0 " (" laddr ")"
close (cmd)
next;
}
}
1 # print all other lines
Pipe the output of mudflap into it, and it will display the sourcefile and line of each backtrace entry.
Also -fstack-protector[-all]
:
-fstack-protector
Emit extra code to check for buffer overflows, such as stack smashing attacks. This is done by adding a guard variable to functions with vulnerable objects. This includes functions that call alloca, and functions with buffers larger than 8 bytes. The guards are initialized when a function is entered and then checked when the function exits. If a guard check fails, an error message is printed and the program exits.
-fstack-protector-all
Like -fstack-protector except that all functions are protected.
It's not a static array.
Undefined behavior or not, it's writing to an address 13 integers from the beginning of the array. What's there is your responsibility. There are several C techniques that intentionally misallocate arrays for reasonable reasons. And this situation is not unusual in incomplete compilation units.
Depending on your flag settings, there are a number of features of this program that would be flagged, such as the fact that the array is never used. And the compiler might just as easily optimize it out of existence and not tell you - a tree falling in the forest.
It's the C way. It's your array, your memory, do what you want with it. :)
(There are any number of lint tools for helping you find this sort of thing; and you should use them liberally. They don't all work through the compiler though; Compiling and linking are often tedious enough as it is.)
I believe that some compilers do in certain cases. For example, if my memory serves me correctly, newer Microsoft compilers have a "Buffer Security Check" option which will detect trivial cases of buffer overruns.
Why don't all compilers do this? Either (as previously mentioned) the internal representation used by the compiler doesn't lend itself to this type of static analysis or it just isn't high enough of the writers priority list. Which to be honest, is a shame either way.