Why doesn't the compiler detect out-of-bounds in string constant initialization?

误落风尘 2021-01-18 00:13

I read this question and its answer in a book. But I didn't understand the book's justification.

Will the following code compile?

6 Answers
  • 2021-01-18 00:42

    What's happening is you're trying to initialize a character array with more characters than the array has room for. Here's how it breaks down:

    char str[5];
    

    Declares a character array with five characters.

    char str[5] = "fast enough";
    

    The second part '= "fast enough";' then attempts to initialize that array with the value "fast enough". This will not work, because "fast enough" is longer than the array is.

    It will, however, compile. C and C++ compilers can't generally perform bounds checking on arrays for you, and overrunning an array is one of the most common reasons for segmentation faults. [edit]As Mark Rushakoff pointed out, newer compilers apparently do issue warnings in some cases.[/edit] This may segfault when you try to run it; more likely, I think, the array will simply be initialized to "fast ".
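
A common way to avoid the mismatch altogether is to omit the array size and let the compiler count the characters. The following is a minimal sketch, not part of the original answer; the names `sized_by_compiler`, `fast_enough_size` and `fast_enough_length` are hypothetical and exist only for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* With no explicit size, the compiler sizes the array from the literal,
   including the terminating '\0'. */
static const char sized_by_compiler[] = "fast enough";

/* Hypothetical helper: number of bytes the array occupies. */
size_t fast_enough_size(void)
{
    return sizeof sized_by_compiler;
}

/* Hypothetical helper: visible length, not counting the terminator. */
size_t fast_enough_length(void)
{
    size_t len = strlen(sized_by_compiler);
    /* The array is exactly one byte longer than the visible text. */
    assert(len + 1 == sizeof sized_by_compiler);
    return len;
}
```

Here the array holds the 11 visible characters plus the terminator, so there is nothing for the compiler to truncate or reject.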

  • 2021-01-18 00:50

    The answer to the question that you quoted is incorrect. The correct answer is "No. The code will not compile", assuming a formally correct C compiler (as opposed to quirks of some specific compiler).

    The C language does not allow using an excessively long string literal to initialize a character array of specific size. The only flexibility allowed by the language here is the terminating \0 character. If the array is too short to accommodate the terminating \0, the terminating \0 is silently dropped. But the actual literal string characters cannot be dropped. If the literal is too long, it is a constraint violation and the compiler must issue a diagnostic message.

    char s1[5] = "abc"; /* OK */
    char s2[5] = "abcd"; /* OK */
    char s3[5] = "abcde"; /* OK, zero at the end is dropped (ERROR in C++) */
    char s4[5] = "abcdef"; /* ERROR, initializer is too long (ERROR in C++ as well) */
    

    Whoever wrote your "book" did not know what they were talking about (at least on this specific subject). What they state in the answer is flat out incorrect.

    Note: Supplying excessively long string initializers is illegal in C89/90, C99 and C++. However C++ is even more restrictive in this regard. C++ prohibits dropping the terminating \0 character, while C allows dropping it, as described above.
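
The dropped terminator described above can be observed directly. This is a small sketch for C only (it will not compile as C++); `has_terminator`, `s1_is_terminated` and `s3_is_terminated` are hypothetical names introduced here, not something from the answer.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: true if buf[0..size-1] contains a '\0'. */
bool has_terminator(const char *buf, size_t size)
{
    assert(buf != NULL);
    return memchr(buf, '\0', size) != NULL;
}

bool s1_is_terminated(void)
{
    char s1[5] = "abc";   /* room to spare: remaining elements are zeroed */
    return has_terminator(s1, sizeof s1);
}

bool s3_is_terminated(void)
{
    char s3[5] = "abcde"; /* valid C, but there is no room for '\0' */
    return has_terminator(s3, sizeof s3);
}
```

Because `s3` has no terminator, passing it to functions such as `strlen` or `printf("%s", ...)` would read past the end of the array.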

  • 2021-01-18 00:53

    Your book must be pretty old, because gcc puts out a warning even without -Wall turned on:

    $ gcc c.c
    c.c: In function `main':
    c.c:6: warning: initializer-string for array of chars is too long
    

    If we slightly update the program:

    #include <stdio.h>
    
    int main(int argc, char **argv)
    {
    
            char str[5] = "1234567890";
            printf("%s\n", str);
            return 0;
    }
    

    We can see that gcc seems to truncate the string to the length you've specified. I'm assuming that there happens to be a '\0' where str[5] would be (one past the end of the array), because otherwise we should see garbage after the 5. Maybe gcc implicitly makes str an array of length 6 and automatically sticks the '\0' in there; I'm not sure.

    $ gcc c.c && ./a.exe
    c.c: In function `main':
    c.c:6: warning: initializer-string for array of chars is too long
    12345
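
Printing an array that may lack a terminator with a plain `%s` relies on whatever byte happens to follow it. One hedged alternative is to pass an explicit precision, so that at most `sizeof str` bytes are read. The sketch below assumes a hypothetical helper `format_bounded` and a demo function `demo_five_digits`; neither comes from the answer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: formats at most `size` bytes of a possibly
   unterminated buffer into `out` and returns snprintf's result. */
int format_bounded(char *out, size_t out_size, const char *buf, size_t size)
{
    assert(out != NULL && buf != NULL);
    /* "%.*s" stops after `size` bytes even if no '\0' is found. */
    return snprintf(out, out_size, "%.*s", (int)size, buf);
}

int demo_five_digits(char *out, size_t out_size)
{
    char str[5] = "12345"; /* exactly fills the array: no terminator in C */
    return format_bounded(out, out_size, str, sizeof str);
}
```

The same precision trick works with `printf` itself, for example `printf("%.*s\n", (int)sizeof str, str);`.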
    
  • 2021-01-18 00:54

    In the C++ standard, 8.5.2/2 Character arrays says:

    There shall not be more initializers than there are array elements.

    In the C99 standard, 6.7.8/2 Initialization says:

    No initializer shall attempt to provide a value for an object not contained within the entity being initialized

    C90 6.5.7 Initializers says something similar.

    However, note that for C (both C90 and C99) the '\0' terminating character will be put in the array if there is room. It's not an error if the terminator will not fit (C99 6.7.8/14: "Successive characters of the character string literal (including the terminating null character if there is room or if the array is of unknown size) initialize the elements of the array").

    On the other hand, the C++ standard has an example that indicates an error should be diagnosed if there's not room for the terminating character.

    In either case, this should be diagnosed as an error in all compilers:

    char str[5] = "fast enough";
    

    Maybe pre-ANSI compilers weren't so strict, but any reasonably modern compiler should diagnose this.
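
Where the terminator must also fit (the stricter C++-style rule), C11 code can ask for that check explicitly at compile time. This is only a sketch under the assumption that a C11 compiler is available; the macro `LITERAL_FITS`, the constant `STR_CAPACITY` and the helper functions are hypothetical names.

```c
#include <assert.h>   /* provides static_assert in C11 */
#include <stddef.h>

/* Hypothetical macro: compilation fails unless the literal, including its
   terminating '\0', fits in an array of `capacity` chars. */
#define LITERAL_FITS(literal, capacity) \
    static_assert(sizeof(literal) <= (capacity), \
                  "string literal does not fit with its terminator")

#define STR_CAPACITY 5

LITERAL_FITS("fast", STR_CAPACITY);   /* 5 bytes with '\0': compiles */
/* LITERAL_FITS("fast enough", STR_CAPACITY);  12 bytes: would be rejected */

static const char str[STR_CAPACITY] = "fast";

/* Hypothetical helpers used to inspect the array. */
size_t str_capacity(void)
{
    return sizeof str;
}

char str_last_byte(void)
{
    return str[STR_CAPACITY - 1];
}
```

The check works because `sizeof` applied to a string literal yields the size of the underlying array, terminator included.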

  • 2021-01-18 00:56

    Because "fast enough" is simply a pointer to a null-terminated string. It's too much work for the compiler to figure out if every assignment to a char* or char [] is going to go beyond the bounds of the array.

  • 2021-01-18 00:59

    Array-bound checking happens at runtime, not compile time. The compiler has no way of doing the static analysis of the above code that would be necessary to prevent the error.

    UPDATE: Apparently the above statement is true for some compilers and not others. If your book says it will compile, it must be referring to a compiler that doesn't do the checking.
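
When the string is only known at run time, the size check has to be written by hand. The following sketch shows one possible approach using `snprintf`, which truncates and always terminates; `copy_checked` is a hypothetical helper, not something the answer proposes.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: copies src into dst, truncating if needed and always
   terminating. Returns true only if the whole string fit. */
bool copy_checked(char *dst, size_t dst_size, const char *src)
{
    assert(dst != NULL && src != NULL && dst_size > 0);
    /* snprintf returns the length it would have written without the limit. */
    int needed = snprintf(dst, dst_size, "%s", src);
    return needed >= 0 && (size_t)needed < dst_size;
}
```

With a five-element destination, copying "fast enough" leaves "fast" in the array and reports that the source did not fit.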
