In many code samples, people usually use '\0'
after creating a new char array like this:
string s = "JustAString";
char* array = new char[
The title of your question references C strings. C++ std::string
objects are handled differently than standard C strings. \0
is important when using C strings, and when I use the term string
here, I'm referring to standard C strings.
\0
acts as a string terminator in C. It is known as the null character, or NUL. It signals code that processes strings - standard libraries but also your own code - where the end of a string is. A good example is strlen
which returns the length of a string.
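As a rough illustration of how code relies on the terminator, here is a minimal sketch of a strlen-like loop. The name my_strlen is hypothetical and chosen only for this example; real library implementations are usually more heavily optimized.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper for illustration: counts characters up to, but not
// including, the first '\0', which is conceptually what strlen does.
std::size_t my_strlen(const char* str) {
    assert(str != nullptr);  // precondition: caller passes a valid C string
    std::size_t length = 0;
    while (str[length] != '\0') {
        ++length;
    }
    return length;
}
```

With this sketch, my_strlen("JustAString") yields 11, while my_strlen("Just\0AString") yields 4, because counting stops at the first \0 it meets.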
When you declare a constant string with:
const char *str = "JustAString";
then the \0
is appended automatically for you. In other cases, where you'll be managing a non-constant string as with your array example, you'll sometimes need to deal with it yourself. The docs for strncpy, which is used in your example, are a good illustration: strncpy
copies over the null termination characters except in the case where the specified length is reached before the entire string is copied. Hence you'll often see strncpy
combined with the possibly redundant assignment of a null terminator. strlcpy
and strcpy_s
were designed to address the potential problems that arise from neglecting to handle this case.
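The strncpy behavior described above can be sketched as follows. The helper name copy_terminated and its parameters are assumptions made for this example, not part of any library.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical helper: copies at most dst_size - 1 characters with strncpy
// and then always writes the terminator explicitly.
void copy_terminated(char* dst, std::size_t dst_size, const char* src) {
    assert(dst != nullptr && src != nullptr && dst_size > 0);
    // If src holds dst_size - 1 or more characters, strncpy stops at the
    // count and writes no '\0' of its own.
    std::strncpy(dst, src, dst_size - 1);
    // So the last slot is set by hand. When src was shorter, strncpy has
    // already padded with '\0' and this assignment is merely harmless.
    dst[dst_size - 1] = '\0';
}
```

For instance, copying "JustAString" into a 5-byte buffer this way leaves the buffer holding "Just" followed by the terminator.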
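In your particular example, array[s.size()] = '\0'; is one such assignment, and here it is not redundant: although array is of size s.size() + 1, strncpy is copying only s.size() characters, so it reaches the specified length before the source's terminator and does not append the \0 itself. The explicit assignment is what terminates the string.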
The documentation for standard C string utilities will indicate when you'll need to be careful to include such a null terminator. But read the documentation carefully: as with strncpy
the details are easily overlooked, leading to potential buffer overflows.
Why are strings in C++ usually terminated with '\0'?
Note that C++ strings and C strings are not the same.
In C++, string refers to std::string, a specialization of the std::basic_string class template that provides many convenient functions for handling strings.
Note that std::string does not rely on a \0 terminator to mark its end, since it tracks its length separately, but the class provides functions such as c_str() to fetch the underlying string data as a \0-terminated C-style string.
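A small sketch of that difference, using two hypothetical helpers (c_length and with_embedded_nul are names invented for this example):

```cpp
#include <cstddef>
#include <cstring>
#include <string>

// Hypothetical helper: the length a C function sees through c_str(),
// i.e. the number of characters before the first '\0'.
std::size_t c_length(const std::string& s) {
    return std::strlen(s.c_str());
}

// Builds a std::string of 12 characters that contains an embedded '\0'.
// The explicit length is needed; otherwise construction stops at the '\0'.
std::string with_embedded_nul() {
    return std::string("Just\0AString", 12);
}
```

For the string returned by with_embedded_nul(), size() reports 12 because std::string tracks its own length, while c_length() reports 4 because strlen stops at the first \0.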
In C, a string is a collection of characters. This collection usually ends with a \0.
Unless a special character like \0 is used, there would be no way of knowing where a string ends.
It is also aptly known as the string null terminator.
Of course, there could be other ways of bookkeeping to track the length of the string, but using a special character has two straightforward advantages:
Note that \0 is needed because most Standard C library functions operate on strings assuming they are \0-terminated.
For example:
When using printf() with a string that is not \0-terminated, printf() keeps reading past the end of the array and writing characters to stdout until a \0 is encountered. This is undefined behavior; in short, it might print garbage or crash.
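When the length of a buffer is known but the buffer may lack a terminator, one possible hedge is to give the printf family an explicit precision. The sketch below is only an illustration: format_unterminated is a hypothetical helper, and it uses snprintf into a fixed 64-byte buffer (an arbitrary choice) so the result can be inspected instead of printed.

```cpp
#include <cstddef>
#include <cstdio>
#include <string>

// Hypothetical helper: formats `length` characters from `data`, which need
// not be '\0'-terminated, by passing an explicit precision so formatting
// never reads past the characters we own.
std::string format_unterminated(const char* data, std::size_t length) {
    char out[64];  // arbitrary size for this sketch; longer output is cut off
    // "%.*s" reads at most `length` characters; the precision is an int.
    std::snprintf(out, sizeof out, "[%.*s]", static_cast<int>(length), data);
    return std::string(out);
}
```

Given a 4-character array holding J, u, s, t with no terminator, format_unterminated(data, 4) produces "[Just]" without relying on a \0 in the input.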
Why should we use '\0' here?
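There are two scenarios when you do not need to \0 terminate a string yourself: when you explicitly keep track of the string's length, and when you use a library function that implicitly adds a \0 to strings. In your case the second scenario does not apply as the code is written.
array[s.size()] = '\0';
The above statement is not redundant in your example.
strncpy() copies at most s.size() characters to your array. Note that it appends null termination only if the count passed as its third argument leaves space after copying the string; it knows nothing about the actual size of array. Although array is of size s.size() + 1, the count here is s.size(), so no \0 is added automatically and the explicit assignment is what terminates the string.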
strncpy(array, s.c_str(), s.size());
array[s.size()] = '\0';
Why should we use '\0' here?
You don't need to if you let strncpy copy the terminator: strncpy already adds null termination if you know how to use it, which makes that second line a waste of space. The code can be rewritten as:
strncpy(array, s.c_str(), s.size()+1);
strncpy is sort of a weird function: it assumes that the first parameter is an array of the size given by the third parameter. So it only copies null termination if there is any space left after copying the string.
You could also have used memcpy() in this case. It will be slightly more efficient, though it perhaps makes the code less intuitive to read.
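A minimal sketch of the memcpy() variant, assuming the same s.size() + 1 allocation as in the question (duplicate_as_c_string is a hypothetical name used only here):

```cpp
#include <cstring>
#include <string>

// Hypothetical helper: returns a new[]-allocated, '\0'-terminated copy of s.
// The caller owns the result and must release it with delete[].
char* duplicate_as_c_string(const std::string& s) {
    char* array = new char[s.size() + 1];
    // c_str() points at s.size() characters followed by a '\0', so copying
    // s.size() + 1 bytes brings the terminator along.
    std::memcpy(array, s.c_str(), s.size() + 1);
    return array;
}
```

Because the terminator is copied together with the characters, no separate array[s.size()] = '\0'; assignment is needed in this version.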
In C, we represent a string with an array of char (or wchar_t), and use a special character to signal the end of the string. As opposed to Pascal, which stores the length of the string at index 0 of the array (thus the string has a hard limit on the number of characters), there is theoretically no limit on the number of characters that a string (represented as an array of characters) can have in C.
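To make the contrast concrete, here is a rough sketch of a length-prefixed layout, loosely modeled on the Pascal convention described above. The type LengthPrefixed and both helpers are hypothetical names for this example, not an actual Pascal or C library interface.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Illustrative length-prefixed layout: byte 0 holds the length, so at most
// 255 characters fit and no terminator is required.
struct LengthPrefixed {
    unsigned char bytes[256];  // bytes[0] = length, bytes[1..] = characters
};

// Hypothetical helper: builds a LengthPrefixed value from a C string,
// truncating anything beyond 255 characters.
LengthPrefixed to_length_prefixed(const char* c_str) {
    assert(c_str != nullptr);
    LengthPrefixed result{};
    std::size_t length = std::strlen(c_str);  // scans for the '\0'
    if (length > 255) {
        length = 255;  // the one-byte prefix cannot express more
    }
    result.bytes[0] = static_cast<unsigned char>(length);
    std::memcpy(result.bytes + 1, c_str, length);
    return result;
}

// Reading the length is a single array access here, with no scan needed.
std::size_t length_of(const LengthPrefixed& s) {
    return s.bytes[0];
}
```

The trade-off is visible in the code: the prefixed form answers length queries immediately but caps the string at 255 characters, while the terminated form has no such cap but must be scanned to find its end.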
The special character is expected to be NUL in all the functions from the standard library in C, and also in other libraries. If you want to use library functions that rely on knowing where the string ends, you must terminate the string with NUL. You can define your own terminating character, but you must understand that library functions involving strings (as arrays of characters) may not work as you expect, and this can cause all sorts of errors.
In the snippet of code given, there is a need to explicitly set the terminating character to NUL, since new char[] leaves the array uninitialized and you don't know if there is trash data in it. It is also good practice, since in a large codebase you may not see the initialization of the array of characters.
'\0' is the null termination character. If your character array didn't have it and you tried to do a strcpy from it, strcpy would read past the end of the array and you would likely have a buffer overflow. Many functions rely on it to know when they need to stop reading or writing memory.