In C is there any guarantee with code prior to undefined behavior?

Submitted by 人走茶凉 on 2020-01-13 08:11:46

Question


In the following code, is it guaranteed that "0\n" will be printed?

#include <stdio.h>
int main(void)
{
    int c = 0;
    printf("%d\n",c);

    printf("%d,%d\n",++c,++c);
}

More generally, if a program has undefined behavior does the entire program become undefined or only from the sequence point that begins the problematic code?

Please note: I am not asking about what the compiler does with the second printf. I am asking if the first printf is guaranteed to occur.

I know that undefined behavior is capable of blowing up your computer, crashing your program, or whatnot.


Answer 1:


Well, even ignoring things like "Anything could happen! The program could travel back in time and prevent itself from running in the first place!", it's perfectly possible for a compiler to detect some forms of undefined behavior and refuse to compile, in which case you wouldn't have gotten the program to run in the first place. So yes, undefined behavior is contagious in principle, if not necessarily so in practice most of the time.




Answer 2:


Whatever has been done by the program before it causes undefined behavior is of course already done.

So the printf() would have sent the "0\n" to the stdout stream. Whether that data actually made it to the device depends on whether or not that stream is unbuffered, buffered, or line-buffered.
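As a rough sketch of the buffering point: if the first line of output must reach the device before any later code runs, the program can flush the stream explicitly or switch it to unbuffered mode. The helper names `print_and_flush` and `make_unbuffered` are hypothetical and not part of the original answer; `fflush` and `setvbuf` are standard C. This only narrows the window in which buffered data can be lost. It makes no promise about what later undefined behavior may do.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: print one integer followed by a newline and push it
   out of the stdio buffer right away. Returns 0 on success, -1 on failure. */
int print_and_flush(FILE *out, int value)
{
    assert(out != NULL);
    if (fprintf(out, "%d\n", value) < 0) {
        return -1;
    }
    /* fflush hands any buffered data for this stream to the host environment. */
    return (fflush(out) == 0) ? 0 : -1;
}

/* Hypothetical alternative: make the stream unbuffered. setvbuf has to be
   called before any other operation on the stream. Returns 0 on success. */
int make_unbuffered(FILE *out)
{
    assert(out != NULL);
    return setvbuf(out, NULL, _IONBF, 0);
}
```

In the question's program, calling `print_and_flush(stdout, c)` in place of the first printf() would hand "0\n" to the host environment before the second printf() is reached.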

Then again, I suppose that it's possible that undefined behavior executed subsequent to the completed, well-defined actions might cause damage to the extent that it appears that the well-defined behavior didn't complete correctly. I guess kind of like one of those "if a tree falls in the woods...." things.


Update to address the belief that future undefined behavior means all bets are off even before a program starts executing...

Here's what the C99 standard has to say about modifying the value of an object more than once between sequence points:

Between the previous and next sequence point an object shall have its stored value modified at most once by the evaluation of an expression.
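To illustrate the rule just quoted, here is a minimal sketch of how the two increments from the question could be separated by sequence points, so that the object is modified only once between each pair. The function name `increment_twice` is made up for this illustration and does not appear in the original answer.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: increments *c twice, with each increment in its own
   full expression, and stores the two intermediate values.
   The result pointers must not alias c. */
void increment_twice(int *c, int *first, int *second)
{
    assert(c != NULL && first != NULL && second != NULL);
    assert(first != c && second != c);

    *first = ++(*c);   /* the end of this statement is a sequence point */
    *second = ++(*c);  /* so this is a separate, well-defined modification */
}
```

With `int c = 0, a, b;`, calling `increment_twice(&c, &a, &b)` and then `printf("%d,%d\n", a, b)` prints 1,2 on every conforming implementation, unlike the single-statement form in the question.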

And the standard also has this to say about access to an object:

access

 <execution-time action> to read or modify the value of an object
 NOTE 1   Where only one of these two actions is meant, ``read'' or ``modify'' is used.
 NOTE 2   ``Modify'' includes the case where the new value being stored is the same as the previous value.
 NOTE 3   Expressions that are not evaluated do not access objects.

I don't think that modifying an object more than once between sequence points is 'undefined behavior' at translation time, since objects aren't accessed/modified at translation time.

Even so, I agree that a compiler that diagnoses this undefined behavior at compile time would be a good thing, but I also think this question is more interesting if taken to apply only to programs that have been successfully compiled. So let's change the question a little bit to give a situation where the compiler can't diagnose undefined behavior at translation time:

#include <stdio.h>
#include <stdlib.h>

int main(int argc, char* argv[])
{
    int c[] = { 0, 1, 2, 3 };
    int *p1 = &c[0];
    int *p2 = &c[1];

    if (argc > 1) {
        p1 = &c[atoi(argv[1])];
    }
    if (argc > 2) {
        p2 = &c[atoi(argv[2])];
    }

    printf("before: %d, %d\n", *p1, *p2);

    printf("after:  %d, %d\n", ++(*p1),++(*p2)); /* possible undefined behavior */

    return 0;
}

In this program the undefined behavior can't even be known to exist at translation time - it only occurs if the input to the program indicates that the same array element should be processed (or a different type of undefined behavior can occur if the input specifies invalid index values).

So let's pose the same question with this program: what does the standard say about what might happen to the first printf() results or side effects?

If the inputs provide valid index values the undefined behavior can only happen after the first printf(). Assume the input is argv[1] == "1" and argv[2] == "1": the compiler implementation does not have the freedom to determine before the first printf() that since undefined behavior will happen at some point in the program it's allowed to skip the first printf() and go right to its undefined behavior of formatting the hard disk (or whatever other horrors might happen).

Given that the compiler agrees to translate a program, the promise of future undefined behavior doesn't give the compiler the freedom to do whatever it wants before that undefined behavior actually takes place. Of course, as I mentioned before, the damage done by the undefined behavior could possibly destroy the previous results - but those results had to have happened.
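One way to keep the modified program inside defined behavior for every input is to validate the indices and to perform the two increments in separate statements. The sketch below is only an assumption about how that might look: `parse_index` and `increment_pair` are hypothetical names, and `strtol` stands in for `atoi` solely so that malformed input can be detected.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/* Hypothetical helper: parse text as an array index in [0, count).
   Returns 0 and stores the index on success, -1 otherwise. */
int parse_index(const char *text, size_t count, size_t *index)
{
    char *end = NULL;
    long value;

    assert(text != NULL && index != NULL);
    errno = 0;
    value = strtol(text, &end, 10);
    if (errno != 0 || end == text || *end != '\0') {
        return -1;                 /* not a clean decimal number */
    }
    if (value < 0 || (unsigned long)value >= count) {
        return -1;                 /* would index outside the array */
    }
    *index = (size_t)value;
    return 0;
}

/* Hypothetical helper: increment *p1, then *p2, in separate statements.
   This stays well defined even when p1 == p2. The result pointers r1 and r2
   must refer to separate variables, not to the objects being incremented. */
void increment_pair(int *p1, int *p2, int *r1, int *r2)
{
    assert(p1 != NULL && p2 != NULL && r1 != NULL && r2 != NULL);
    *r1 = ++(*p1);
    *r2 = ++(*p2);
}
```

With these helpers, a main() could reject any argument for which `parse_index` returns -1 and pass the two results of `increment_pair` to the second printf(), so that neither the out-of-range case nor the same-element case reaches undefined behavior.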




Answer 3:


Undefined behavior is up to the compiler vendor/random chance. That means it could throw an exception, corrupt data in your program, write over your mp3 collection, call down an angel, or light your grandmother on fire. Once you have undefined behavior, your entire program becomes undefined.

Some compilers and some compiler configurations will provide modes that throw you a bone, but once you turn on optimizations, most programs will behave pretty poorly.

if a program has undefined behavior does the entire program become undefined or only from the sequence point that begins the problematic code?

The code running up to the undefined point will probably have done the right thing. But that only does so much good. Once you hit undefined behavior, literally anything can happen. Whether something will happen is covered by Murphy's Law :)

Optimizations rely on well defined behavior, and play all sorts of tricks outside that box to gain speed. This means your code can execute completely out of order, as long as the side effects would be indistinguishable for a well defined program. Just because undefined behavior seems to start at a certain point in your source code does not guarantee that any previous lines of code will be immune. With optimizations enabled, your code can very easily hit the undefined behavior much earlier.
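A commonly cited illustration of this point (not taken from the original answer) is a null check placed after a dereference. Because dereferencing a null pointer is undefined, an optimizing compiler is allowed to assume the pointer is non-null once it has been dereferenced, and may then drop the later check. Whether a particular compiler actually does so depends on its version and flags. The function names below are hypothetical.

```c
#include <stddef.h>

/* Risky ordering: *p is read before p is tested. If p is NULL the read is
   undefined, so the compiler may treat the check below as dead code. */
int read_then_check(const int *p)
{
    int value = *p;
    if (p == NULL) {
        return -1;
    }
    return value;
}

/* Safer ordering: the check happens before any dereference, so the
   function is well defined for every argument. */
int check_then_read(const int *p)
{
    if (p == NULL) {
        return -1;
    }
    return *p;
}
```

For a valid pointer both functions return the pointed-to value. Only `check_then_read` gives a defined result for a null argument.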

Food for thought: Buffer overrun exploits, as implemented by various types of malware, rely heavily on undefined behavior.




Answer 4:


For undefined behavior one should probably distinguish between things that are detectable at compile time (as in your case) and things that are data dependent and only occur at run time, e.g. accidentally writing to a const-qualified object.

For the latter, the program must run until the UB occurs, because the compiler usually can't detect it beforehand (model checking is a tough task for nontrivial programs). For your case, the compiler might be allowed to produce any kind of program, e.g. one that sends some money to the compiler vendor ;-)

A more reasonable choice would be to produce nothing at all, that is, to report an error and refuse to compile. Some compilers do that when told to, e.g. with gcc you get this with -Wall -std=c99 --pedantic -Werror.



Source: https://stackoverflow.com/questions/4002173/in-c-is-there-any-guarantee-with-code-prior-to-undefined-behavior
