Consider the following code:
char str[] = "Hello\0";
What is the length of the str array, and how many 0s does it end with?
sizeof str is 7 - five bytes for the "Hello" text, plus the explicit NUL terminator, plus the implicit NUL terminator.
strlen(str) is 5 - the five "Hello" bytes only.
The key here is that the implicit NUL terminator is always added - even if the string literal just happens to end with \0. Of course, strlen just stops at the first \0 - it can't tell the difference.
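As a quick way to check this yourself, here is a minimal sketch. The helper names array_size and visible_length are made up for illustration; the only real library call is strlen.

```c
#include <stddef.h>
#include <string.h>

/* Array initialized from a literal that already ends in an explicit \0. */
static const char str[] = "Hello\0";

/* Total bytes in the array: 5 letters + explicit NUL + implicit NUL. */
size_t array_size(void) { return sizeof str; }

/* Bytes before the first NUL, which is all strlen can see. */
size_t visible_length(void) { return strlen(str); }
```

Here array_size() returns 7 and visible_length() returns 5.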
There is one exception to the implicit NUL terminator rule - if you explicitly specify the array size, the string will be truncated to fit:
char str[6] = "Hello\0"; // strlen(str) = 5, sizeof(str) = 6 (with one NUL)
char str[7] = "Hello\0"; // strlen(str) = 5, sizeof(str) = 7 (with two NULs)
char str[8] = "Hello\0"; // strlen(str) = 5, sizeof(str) = 8 (with three NULs per C99 6.7.8.21)
This is, however, rarely useful, and prone to miscalculating the string length and ending up with an unterminated string. It is also forbidden in C++.
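If you want to verify the NUL counts in those comments, something like the following should do, compiled as C. The helper count_trailing_nuls and the names str6, str7 and str8 are hypothetical, chosen only so the three declarations can coexist in one file.

```c
#include <stddef.h>

/* Hypothetical helper: count the zero bytes at the end of a buffer. */
size_t count_trailing_nuls(const char *buf, size_t size)
{
    size_t n = 0;
    while (n < size && buf[size - 1 - n] == '\0')
        n++;
    return n;
}

/* The three declarations from above, renamed so they can live side by side. */
char str6[6] = "Hello\0";   /* implicit NUL dropped, explicit one kept */
char str7[7] = "Hello\0";   /* explicit + implicit NUL */
char str8[8] = "Hello\0";   /* explicit + implicit + one zero-filled element */
```

Calling count_trailing_nuls(str6, sizeof str6) gives 1, and the same call on str7 and str8 gives 2 and 3.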
The length of the array is 7. The NUL character \0 still counts as a character, and the string is still terminated with an implicit \0.
Note that had you declared str as char str[6]= "Hello\0"; the length would be 6, because the implicit NUL is only added if it can fit (which it can't in this example).
§ 6.7.8/p14
An array of character type may be initialized by a character string literal, optionally enclosed in braces. Successive characters of the character string literal (including the terminating null character if there is room or if the array is of unknown size) initialize the elements of the array.
char str[] = "Hello\0"; /* sizeof == 7, Explicit + Implicit NUL */
char str[5]= "Hello\0"; /* sizeof == 5, str is "Hello" with no NUL (no longer a C-string, just an array of char). This may trigger compiler warning */
char str[6]= "Hello\0"; /* sizeof == 6, Explicit NUL only */
char str[7]= "Hello\0"; /* sizeof == 7, Explicit + Implicit NUL */
char str[8]= "Hello\0"; /* sizeof == 8, Explicit + two Implicit NUL */
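To see the "no longer a C-string" case without calling strlen on an unterminated array (which would read past the end), a bounded search with memchr is one option. This is only a sketch: is_terminated, exact and roomy are names I made up, and I use plain "Hello" rather than "Hello\0" for the 5-byte case so the initializer is valid C without relying on a compiler extension or warning.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: nonzero if a NUL occurs within the first `size` bytes. */
int is_terminated(const char *buf, size_t size)
{
    return memchr(buf, '\0', size) != NULL;
}

char exact[5] = "Hello";   /* valid C: no room left for the implicit NUL */
char roomy[6] = "Hello";   /* the implicit NUL fits */
```

is_terminated(exact, sizeof exact) returns 0, while is_terminated(roomy, sizeof roomy) returns 1.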
Banging my usual drum solo of JUST TRY IT, here's how you can answer questions like that in the future:
$ cat junk.c
#include <stdio.h>
char* string = "Hello\0";
int main(int argc, char** argv)
{
printf("-->%s<--\n", string);
}
$ gcc -S junk.c
$ cat junk.s
... eliding the unnecessary parts ...
.LC0:
.string "Hello"
.string ""
...
.LC1:
.string "-->%s<--\n"
...
Note here how the string I used for printf is just "-->%s<--\n" while the global string is in two parts: "Hello" and "". The GNU assembler also terminates strings with an implicit NUL character, so the fact that the first string (.LC0) is in those two parts indicates that there are two NULs. The string is thus 7 bytes long. Generally, if you really want to know what your compiler is doing with a certain hunk of code, isolate it in a dummy example like this and see what it's doing using -S (for GNU; MSVC has a flag for assembler output too, but I don't know it off-hand). You'll learn a lot about how your code works (or fails to work, as the case may be), and you'll get an answer quickly that is 100% guaranteed to match the tools and environment you're working in.
What is the length of the str array, and how many 0s does it end with?
Let's find out:
#include <stdio.h>

int main(void) {
    char str[] = "Hello\0";
    int length = sizeof str / sizeof str[0];
    // "sizeof array" is the bytes for the whole array (must use a real array, not
    // a pointer), divide by "sizeof array[0]" (sometimes sizeof *array is used)
    // to get the number of items in the array
    printf("array length: %d\n", length);
    // cast to unsigned char so each byte prints as two hex digits
    printf("last 3 bytes: %02x %02x %02x\n",
           (unsigned char)str[length - 3], (unsigned char)str[length - 2],
           (unsigned char)str[length - 1]);
    return 0;
}
char str[]= "Hello\0";
That would be 7 bytes.
In memory it'd be:
48 65 6C 6C 6F 00 00
H e l l o \0 \0
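If you'd rather produce that dump from a program than take my word for it, here is a rough sketch. hex_dump is a hypothetical helper, and the expected bytes assume an ASCII-compatible character set.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: write `size` bytes of `buf` into `out` as
 * space-separated uppercase hex. `out` must hold at least 3 * size bytes
 * (and at least 1 byte when size is 0). */
void hex_dump(const char *buf, size_t size, char *out)
{
    out[0] = '\0';
    for (size_t i = 0; i < size; i++) {
        /* Go through unsigned char so bytes >= 0x80 are not sign-extended. */
        out += sprintf(out, i ? " %02X" : "%02X",
                       (unsigned)(unsigned char)buf[i]);
    }
}
```

With char str[] = "Hello\0"; and a 21-byte output buffer, hex_dump(str, sizeof str, out) leaves "48 65 6C 6C 6F 00 00" in out.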
Edit:
What does the \0 symbol mean in a C string?
It marks the "end" of a string: it's the null character. In memory, it's actually a zero byte. Functions that handle char arrays usually look for this character, since it marks the end of the message. I'll put an example at the end.
What is the length of the str array? (Answered before the edit part)
7
and how many 0s does it end with?
Your array has two elements holding zero: str[5]=str[6]='\0'=0
Extra example:
Let's assume you have a function that prints the content of a text array.
You could define the array as:
char str[40];
Now, you could change the content of that array (I won't get into details on how to), so that it contains the message: "This is just a printing test". In memory, you should have something like:
54 68 69 73 20 69 73 20 6a 75 73 74 20 61 20 70 72 69 6e 74
69 6e 67 20 74 65 73 74 00 00 00 00 00 00 00 00 00 00 00 00
So you print that char array. Then you want a new message, let's say just "Hello":
48 65 6c 6c 6f 00 73 20 6a 75 73 74 20 61 20 70 72 69 6e 74
69 6e 67 20 74 65 73 74 00 00 00 00 00 00 00 00 00 00 00 00
Notice the 00 at str[5]. That's how the print function knows how much it actually needs to send, regardless of the actual length of the array and whatever else it still contains.
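The buffer reuse described above could be reproduced with something like this. It is only a sketch: reuse_buffer is a made-up name, and using strcpy to change the content is my assumption, since I skipped over how the array gets filled.

```c
#include <string.h>

/* Fill `buf` (assumed to hold at least 40 bytes) with the long message,
 * then overwrite its start with the short one, as in the dumps above. */
void reuse_buffer(char *buf)
{
    strcpy(buf, "This is just a printing test");
    strcpy(buf, "Hello");   /* copies 'H','e','l','l','o' and one NUL */
}
```

After reuse_buffer(str), strlen(str) is 5 and str[5] is 0, yet str[6] still holds the 's' left over from the old message.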
In particular, I want to mention one situation that may confuse you.
What is the difference between "\0" and ""?
The answer is that "\0" is represented as the array {0, 0}, and "" as {0}.
That is because "\0" is still a string literal, so a terminating '\0' is also added at the end of it. "" is empty, but it gets the terminating '\0' as well.
Understanding this will help you understand "\0" more deeply.
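A small sketch to confirm the two sizes, using sizeof directly on the literals. The function names size_of_escaped and size_of_empty are hypothetical.

```c
#include <stddef.h>

/* sizeof applied to a string literal gives the size of its array,
 * including the implicit terminator. */
size_t size_of_escaped(void) { return sizeof "\0"; }  /* bytes: 00 00 */
size_t size_of_empty(void)   { return sizeof ""; }    /* bytes: 00 */
```

size_of_escaped() returns 2 and size_of_empty() returns 1.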