Ambiguous behaviour of strcmp()

本秂侑毒 submitted on 2020-03-22 08:38:25

Question


Please note that I have checked the related questions with similar titles, but as far as I can tell they do not answer this one.

Initially I thought that program1 and program2 would give me the same result.

//Program 1

char *a = "abcd";
char *b = "efgh";
printf("%d", strcmp(a,b));


//Output: -4

//Program 2
printf("%d", strcmp("abcd", "efgh"));

//Output: -1

The only difference I can spot is that in program2 I pass string literals, while in program1 I pass char * variables as the arguments of the strcmp() function.

Why do these seemingly identical programs behave differently?

Platform: Linux Mint. Compiler: g++.

Edit: Actually, program1 always prints the difference between the ASCII codes of the first mismatched characters, whereas program2 prints -1 if the ASCII code of the first mismatched character in string2 is greater than that in string1, and vice versa.


Answer 1:


This is your C code:

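#include <stdio.h>
#include <string.h>

void x1(void)
{
  char *a = "abcd";
  char *b = "efgh";
  printf("%d", strcmp(a,b));              /* arguments are pointer variables */
}

void x2(void)
{
  printf("%d", strcmp("abcd", "efgh"));   /* arguments are string literals */
}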

And this is the generated assembly output for both functions:

.LC0:
        .string "abcd"
.LC1:
        .string "efgh"
.LC2:
        .string "%d"
x1:
        push    rbp
        mov     rbp, rsp
        sub     rsp, 16
        mov     QWORD PTR [rbp-8], OFFSET FLAT:.LC0
        mov     QWORD PTR [rbp-16], OFFSET FLAT:.LC1
        mov     rdx, QWORD PTR [rbp-16]
        mov     rax, QWORD PTR [rbp-8]
        mov     rsi, rdx
        mov     rdi, rax
        call    strcmp              // the strcmp function is actually called
        mov     esi, eax
        mov     edi, OFFSET FLAT:.LC2
        mov     eax, 0
        call    printf
        nop
        leave
        ret

x2:
        push    rbp
        mov     rbp, rsp
        mov     esi, -1             // strcmp is never called, the compiler
                                    // knows what the result will be and it just
                                    // uses -1
        mov     edi, OFFSET FLAT:.LC2
        mov     eax, 0
        call    printf
        nop
        pop     rbp
        ret

When the compiler sees strcmp("abcd", "efgh") it knows the result beforehand, because it knows that "abcd" comes before "efgh".

But if it sees strcmp(a,b) it does not know and hence generates code that actually calls strcmp.

With another compiler or with different compiler settings things could be different. You really shouldn't care about such details at least at a beginner's level.
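If a program really does need the character difference the question observed, rather than just the sign, one option is to compute it explicitly instead of relying on whatever strcmp happens to return. The following is only a sketch: first_mismatch_diff is a hypothetical helper name, not a C library function, and it assumes the bytes should be compared as unsigned char, as strcmp itself does.

```c
/* Hypothetical helper (not part of the C library): returns the difference
   between the first pair of differing bytes, compared as unsigned char,
   or 0 if the two strings are equal. */
int first_mismatch_diff(const char *s1, const char *s2)
{
    const unsigned char *p = (const unsigned char *)s1;
    const unsigned char *q = (const unsigned char *)s2;

    /* Advance while the bytes match and the end of s1 is not reached. */
    while (*p != '\0' && *p == *q) {
        ++p;
        ++q;
    }
    return (int)*p - (int)*q;
}
```

On an ASCII system, first_mismatch_diff("abcd", "efgh") is 'a' - 'e', that is -4, whether the arguments are literals or variables, because the value comes from this code and not from a compiler or library choice.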




Answer 2:


It is indeed surprising that strcmp returns two different values for these calls, but this is not incompatible with the C Standard:

strcmp() returns a negative value if the first string is lexicographically before the second string. Both -4 and -1 are negative values.
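Since only the sign is specified, portable code can collapse the result to -1, 0 or 1 before printing or comparing it. A minimal sketch, where strcmp_sign is a hypothetical helper name introduced here for illustration:

```c
#include <string.h>

/* Hypothetical helper: maps any strcmp result to -1, 0 or 1, so the value
   no longer depends on how the compiler or library computed it. */
int strcmp_sign(const char *s1, const char *s2)
{
    int r = strcmp(s1, s2);
    return (r > 0) - (r < 0);
}
```

With this wrapper, both the literal form and the char * form from the question would give -1 for "abcd" versus "efgh".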

As pointed out by others, the code generated for the two calls is different:

  • the compiler generates a call to the library function in the first program
  • the compiler is able to determine the result of the comparison and generates an explicit result of -1 for the second case where both arguments are string literals.

In order to perform this compile-time evaluation, strcmp must be defined in a subtle way in <string.h> so the compiler can determine that the program refers to the C library's implementation and not an alternative that might behave differently. Tracing the corresponding prototype in recent GNU libc include files is a bit difficult: a number of nested macros eventually lead to a hidden prototype.

Note that more recent versions of both gcc and clang will perform the optimisation in both cases, as can be tested on Godbolt Compiler Explorer, but neither combines this optimisation with that of printf to generate the even more compact code puts("-1"). They seem to convert printf to puts only for string literal formats without arguments.
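To experiment with the difference between the folded and the library result, one possible trick is to call strcmp through a volatile function pointer, so the compiler has to load the target at run time and cannot assume it is the library's strcmp. This is an illustration only: strcmp_runtime is a hypothetical name, and the exact negative value it returns still depends on the library implementation.

```c
#include <string.h>

/* Hypothetical helper: performs the comparison through a volatile function
   pointer so the call is not evaluated at compile time. */
int strcmp_runtime(const char *s1, const char *s2)
{
    /* The pointer itself is volatile, so it must be read when the function
       runs; the compiler cannot treat the call as a known strcmp call. */
    int (*volatile cmp)(const char *, const char *) = strcmp;
    return cmp(s1, s2);
}
```

Only the sign of strcmp_runtime("abcd", "efgh") is guaranteed to be negative; whether it prints as -4, -1 or something else is up to the C library in use.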




Answer 3:


I believe (though I would need to see and interpret the machine code to be sure) that one version works without calling any code in the library, as if you had written printf("%d", -1);.



Source: https://stackoverflow.com/questions/60306258/ambiguous-behaviour-of-strcmp
