Is GCC's option -O2 breaking this small program or do I have undefined behavior [duplicate]

懵懂的女人 提交于 2019-12-18 12:56:20

问题


I found this problem in a very large application, have made an SSCCE from it. I don't know whether the code has undefined behavior or -O2 breaks it.

When compiling it with gcc a.c -o a.exe -O2 -Wall -Wextra -Werror it prints 5.

But it prints 25 when compiling without -O2 (eg -O1) or uncommenting one of the 2 commented lines (prevent inlining).

#include <stdio.h>
#include <stdlib.h>
// __attribute__((noinline)) 
int f(int* todos, int input) {
    int* cur = todos-1; // fixes the ++ at the beginning of the loop
    int result = input;
    while(1) {
        cur++;
        int ch = *cur;
        // printf("(%i)\n", ch);
        switch(ch) {
            case 0:;
                goto end;
            case 1:;
                result = result*result;
            break;
        }
    }
    end:
    return result;
}
int main() {
    int todos[] = { 1, 0}; // 1:square, 0:end
    int input = 5;
    int result = f(todos, input);
    printf("=%i\n", result);
    printf("end\n");
    return 0;
}

Is GCC's option -O2 breaking this small program or do I have undefined behavior somewhere?


回答1:


int* cur = todos-1;

invokes undefined behavior. todos - 1 is an invalid pointer address.

Emphasis mine:

(C99, 6.5.6p8) "If both the pointer operand and the result point to elements of the same array object, or one past the last element of the array object, the evaluation shall not produce an overflow; otherwise, the behavior is undefined."




回答2:


In supplement to @ouah's answer, this explains what the compiler is doing.

Generated assembler for reference:

  400450:       48 83 ec 18             sub    $0x18,%rsp
  400454:       be 05 00 00 00          mov    $0x5,%esi
  400459:       48 8d 44 24 fc          lea    -0x4(%rsp),%rax
  40045e:       c7 44 24 04 00 00 00    movl   $0x0,0x4(%rsp)
  400465:       00 
  400466:       48 83 c0 04             add    $0x4,%rax
  40046a:       8b 10                   mov    (%rax),%edx

However if I add a printf in main():

  400450:       48 83 ec 18             sub    $0x18,%rsp
  400454:       bf 84 06 40 00          mov    $0x400684,%edi
  400459:       31 c0                   xor    %eax,%eax
  40045b:       48 89 e6                mov    %rsp,%rsi
  40045e:       c7 04 24 01 00 00 00    movl   $0x1,(%rsp)
  400465:       c7 44 24 04 00 00 00    movl   $0x0,0x4(%rsp)
  40046c:       00 
  40046d:       e8 ae ff ff ff          callq  400420 <printf@plt>
  400472:       48 8d 44 24 fc          lea    -0x4(%rsp),%rax
  400477:       be 05 00 00 00          mov    $0x5,%esi
  40047c:       48 83 c0 04             add    $0x4,%rax
  400480:       8b 10                   mov    (%rax),%edx

Specifically (in the printf version), these two instructions populate the todo array

  40045e:       c7 04 24 01 00 00 00    movl   $0x1,(%rsp)
  400465:       c7 44 24 04 00 00 00    movl   $0x0,0x4(%rsp)

This is conspicuously missing from the non-printf version, which for some reason only assigns the second element:

  40045e:       c7 44 24 04 00 00 00    movl   $0x0,0x4(%rsp)


来源:https://stackoverflow.com/questions/23683029/is-gccs-option-o2-breaking-this-small-program-or-do-i-have-undefined-behavior

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!