Why does the string terminator '\0' have the same value as the integer constant 0?

Anonymous (unverified), submitted 2019-12-03 02:38:01

Question:

I have the following code:

#include <stdio.h>
#define LENGTH 5

int main(){
    char* ch[LENGTH] = {"Zero", "One", "Two", "Three", "Four"};
    char* pc;
    char** ppc;
    for(int i=0; i<LENGTH; i++){
        ppc = ch+i;
        pc = *ppc;
        while(*pc != 0){
            printf("%c ", *pc);
            pc = pc +1;
        }
        printf("\n");
    }
    return 0;
}

It is an example of multiple indirection using strings.

The output is

Z e r o
O n e
T w o
T h r e e
F o u r

Here the while() loop tests *pc != 0 instead of *pc != '\0'.

Both approaches give the same output. Why is that?

Answer 1:

Line feed (\n), tab (\t) and so on have their own escape sequences, but there is no dedicated one for the null terminator.

The de facto standard way of representing the null terminator is therefore to write an octal escape sequence with the value zero. An octal escape sequence is a \ followed by one to three octal digits, so \0 simply means zero, written in octal. Because it looks like the other character escape sequences, it has become the conventional way of writing the null terminator.

This is why a decimal 0 works just as well: it is simply another way of writing the value zero. You could equally write '\x0' if you wished to be obscure.



Answer 2:

A char is really nothing more than a small integer, and as such is implicitly convertible to int. Furthermore, character literals (such as 'A') are really represented by the compiler as int values (for example, the character literal 'A' is represented by the int value 65 in ASCII encoding).

The C language allows one to insert any arbitrary integer (that can fit in a char) using escapes. There are two ways to write such arbitrary values: octal escapes and hexadecimal escapes. For example, the ASCII value for A is 65, which can be written as 'A', as '\101' in octal, as '\x41' in hexadecimal, or as plain 65.

Armed with that information it should be easy to see that the character literal '\0' is the octal representation of the integer 0. That is, '\0' == 0.

You can easily verify this by printing it:

printf("'\\0' = %d\n", '\0');

I mentioned that the compiler treats all character literals as int values, but also that arbitrary numbers written as octal or hexadecimal escapes need to fit in a char. That might seem like a contradiction, but it isn't really. A character's value must fit in a char, but the compiler then converts it to an int when it parses the code.



Answer 3:

0 and '\0' are exactly the same value, and in C both have type int. This is fixed by the C standard and is irrespective of the character encoding on your platform. In other words, they are completely indistinguishable. (In C++, the type of '\0' is char.)

So while(*pc != 0), while(*pc != '\0'), and while(*pc) for that matter are all the same thing.

(Personally I find the last one the clearest, but some people prefer the '\0' notation when working with C-style strings.)



Answer 4:

Adding to the existing answers, here is what the standard says about the sentinel, quoting C11, chapter §5.2.1:

In a character constant or string literal, members of the execution character set shall be represented by corresponding members of the source character set or by escape sequences consisting of the backslash \ followed by one or more characters. A byte with all bits set to 0, called the null character, shall exist in the basic execution character set; it is used to terminate a character string.

and from chapter §6.4.4.4/P12,

EXAMPLE 1 The construction '\0' is commonly used to represent the null character.

So the constant '\0' is the one that satisfies the aforesaid property. It is an octal escape sequence.

Now, regarding the value, quoting §6.4.4.4/P5:

The octal digits that follow the backslash in an octal escape sequence are taken to be part of the construction of a single character for an integer character constant or of a single wide character for a wide character constant. The numerical value of the octal integer so formed specifies the value of the desired character or wide character.

So, for the octal escape sequence '\0', the value is 0 (both in octal, as mentioned in §6.4.4.1, and in decimal).


