What happens when we do not include '\0' at the end of a string in C?

Posted by 做~自己de王妃 on 2019-11-27 03:42:19

Question


In C, when I initialize my array this way:

char full_name[] = {
    't', 'o', 'a', 'n'
};

and print it with printf("%s", full_name);

and then run the program under valgrind, I get this error:

Uninitialised value was created by a stack allocation

Why does that happen?


Answer 1:


Since the %s format specifier expects a null-terminated string, the behavior of your code is undefined. The program can produce any output at all, produce no output, crash, and so on. In short, don't do that.

This is not to say that all arrays of characters must be null-terminated: the rule applies only to arrays of characters intended to be used as C strings, e.g. passed to printf with the %s format specifier, or passed to strlen or other string functions of the Standard C library.

If you intend to use your char array for something else, it does not need to be null-terminated. For example, this use is fully defined:

char full_name[] = {
    't', 'o', 'a', 'n'
};
for (size_t i = 0 ; i != sizeof(full_name) ; i++) {
    printf("%c", full_name[i]);
}



Answer 2:


If you do not provide the '\0' at the end of a comma-separated, brace-enclosed initializer list, then technically full_name is not a string, because the char array is not null-terminated.

To clarify: unlike a string literal initializer, a comma-separated list does not automatically append the terminating null character to the array.

So, in case of a definition like

char full_name[] = {
    't', 'o', 'a', 'n'
};

the size of the array is 4 and it contains 't', 'o', 'a', 'n'.

OTOH, in case of

char full_name[] = "toan";

full_name will be of size 5 and will contain 't', 'o', 'a', 'n' and '\0'.

When you pass the former array to any function that operates on strings (i.e., one that expects a null-terminated char array), you get undefined behavior, because most string functions will read out of bounds while searching for the null terminator.

In your particular example, for %s format specifier with printf(), quoting the C11 standard, chapter §7.21.6.1, fprintf() function description (emphasis mine)

s
If no l length modifier is present, the argument shall be a pointer to the initial element of an array of character type. Characters from the array are written up to (but not including) the terminating null character. If the precision is specified, no more than that many bytes are written. If the precision is not specified or is greater than the size of the array, the array shall contain a null character.

That means printf() looks for a null terminator to find the end of the array. In your example, the missing null terminator causes printf() to read past the last element (full_name[3]) and access out-of-bounds memory (full_name[4]), which causes the UB.




Answer 3:


If you use a non-null-terminated char sequence as a string, C string functions will just keep reading. It's the '\0' that tells them to stop. So whatever happens to be in memory after the sequence will be taken as part of the string. This may eventually cross a memory boundary and cause an error, or it may just print gibberish until it happens to find a '\0' somewhere and stops.




Answer 4:


printf treats the argument for "%s" as a standard C string. This means it will simply keep reading characters until it finds a null terminator (\0).

Often this means the read wanders into memory outside the array, and Valgrind reports that as an error.

You have to explicitly add your own null terminator when initialising a char array, if you intend to use it as a string at some point.




Answer 5:


Before passing the instruction pointer to a function expecting a C string, you implicitly enter a legally binding contract with that code block. In the primary section of this contract, both parties agree to refrain from exchanging dedicated string length information and assert that all parameters declared as strings point to a sequence of characters terminated by \0, which gives each party the option to calculate the length.

If you don't include a terminating \0 you will commit a fundamental breach of contract.

The OS court will randomly sue your executable with madness or even death.



Source: https://stackoverflow.com/questions/34995106/what-happened-when-we-do-not-include-0-at-the-end-of-string-in-c
