Observable behavior and undefined behavior — What happens if I don't call a destructor?

滥情空心 2020-12-03 06:48

Note: I've seen similar questions, but none of the answers are precise enough, so I'm asking this myself.

This is a very nitpicky "language-lawye

13 answers
  • 2020-12-03 07:30

    Say you have a class that acquires a lock in its constructor and then releases the lock in its destructor. Releasing the lock is a side effect of calling the destructor.

    Now, it's your job to ensure that the destructor is called. Typically this is done by calling delete, but you can also call it directly, and this is usually done if you've allocated an object using placement new.
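
    A minimal sketch of that idea, assuming a toy `ToyLock` with a boolean flag in place of a real mutex (`ToyLock`, `Guard`, and the helper functions are hypothetical names, not from the question). It shows the two ways of running the destructor mentioned above: `delete`, and a direct call after placement new.

```cpp
#include <cassert>
#include <new>

// Hypothetical stand-in for a real lock: it only records whether it is held.
struct ToyLock {
    bool held = false;
};

// Acquires the lock in the constructor, releases it in the destructor.
class Guard {
public:
    explicit Guard(ToyLock& lock) : lock_(lock) { lock_.held = true; }
    ~Guard() { lock_.held = false; }  // the side effect discussed above
    Guard(const Guard&) = delete;
    Guard& operator=(const Guard&) = delete;
private:
    ToyLock& lock_;
};

// While a Guard is alive, the lock is held.
bool held_while_guard_alive() {
    ToyLock lock;
    Guard g(lock);
    return lock.held;
}

// Ordinary new/delete: delete runs the destructor, then frees the memory.
bool held_after_delete() {
    ToyLock lock;
    Guard* g = new Guard(lock);
    delete g;
    return lock.held;
}

// Placement new: the storage is ours, so the destructor is called directly.
bool held_after_explicit_destructor_call() {
    ToyLock lock;
    alignas(Guard) unsigned char storage[sizeof(Guard)];
    Guard* g = new (storage) Guard(lock);
    g->~Guard();
    return lock.held;
}
```

    In both release paths the lock ends up free again; skipping either the `delete` or the explicit `~Guard()` call would leave `held` set to true.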

    In your example you've allocated two MakeRandom instances, but only called the destructor on one of them. If it were managing some resource (like a file) then you'd have a resource leak.

    So, to answer your question, yes, forgetting to call a destructor is different to forgetting to call an ordinary function. A destructor is the inverse of the constructor. You're required to call the constructor, and so you're required to call the destructor in order to "unwind" anything done by the constructor. This isn't the case with an "ordinary" function.

  • 2020-12-03 07:32

    I simply do not understand what "depends on the side effects" means.

    It means that it depends on something the destructor is doing: in your example, modifying *p or not modifying it. You have that dependency in your code, as the output would differ if the destructor didn't get called.

    In your current code, the number that is printed might not be the same number that would have been returned by the second rand() call. Your program invokes undefined behavior; it's just that the UB here has no ill effect.

    If you didn't print the value (or otherwise read it), then there wouldn't be any dependency on the side effects of the destructor, and thus no UB.

    So:

    Is forgetting to call a destructor any different than forgetting to call an ordinary function with the same body?

    Nope, it's not any different in this regard. If you depend on it being called, you must make sure it's called, otherwise your dependency is not satisfied.

    Furthermore, at what point is it correct to say the program "depends" on the destructor? Does it do so if the value was random -- or in general, if there is no way for me to distinguish the destructor from running vs. not running?

    Random or not doesn't matter, because the code depends on the variable being written to. Just because it's difficult to predict what the new value is doesn't mean there's no dependency.

    What if I never read the value?

    Then there's no UB, as the code has no dependency on the variable after it was written to.

    Under which condition(s), if any, does this program exhibit Undefined Behavior?

    There are no conditions. It's always UB.

    Exactly which expression(s) or statement(s) cause this, and why?

    The expression:

    printf("%d", x);
    

    because it introduces the dependency on the affected variable.
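
    To make the "dependency" concrete, here is a sketch with a hypothetical `Marker` type standing in for the question's MakeRandom (whose code is not reproduced on this page). It writes fixed values instead of calling rand() so that the difference is visible. The value read from `x` differs depending on whether the destructor ran, which is the sense of "depends" used in this answer. Whether the skipped-destructor path really is UB is the point the answers here disagree on.

```cpp
#include <cassert>
#include <new>

// Hypothetical stand-in for the question's MakeRandom: fixed values, no rand().
struct Marker {
    int* p;
    explicit Marker(int* target) : p(target) { *p = 1; }  // constructor writes 1
    ~Marker() { *p = 2; }                                 // destructor writes 2
};

// Builds a Marker in local storage and returns what x holds afterwards.
int value_seen(bool call_destructor) {
    int x = 0;
    alignas(Marker) unsigned char storage[sizeof(Marker)];
    Marker* m = new (storage) Marker(&x);
    if (call_destructor) {
        m->~Marker();  // the side effect happens: x becomes 2
    }
    // If the destructor was skipped, the storage is simply released at scope
    // exit and x keeps the value written by the constructor.
    return x;
}
```

    Reading `x` at the end is what ties the result to the destructor's side effect; a caller that never looked at `x` could not tell the two paths apart.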

  • 2020-12-03 07:32

    For this answer, I will be using a 2012 C++11 release of the C++ standard, which can be found here (C++ standard), because this is freely available and up to date.

    The three terms used in your question occur as follows:

    1. Destructor - 385 times
    2. Side effect - 71 times
    3. Depends - 41 times

    Sadly "depends on the side effect" appears only once, and DEPENDS ON is not an RFC standardized identifier like SHALL, so it's rather hard to pin down what depends means.

    Depends on

    Let's take an "activist judge" approach, and assume that "depends", "dependency", and "depending" are all used in a similar context in this document, that is, that the language was used to convey a broad idea rather than to convey a legalese concept.

    Then we can analyze this portion of page 1194:

    17.6.3.2
    Effect on original feature: Function swap moved to a different header
    Rationale: Remove dependency on <algorithm> for swap.
    Effect on original feature: Valid C++ 2003 code that has been compiled expecting swap to be in <algorithm> may have to instead include <utility>.

    This portion indicates a strict sort of dependency; you originally needed to include <algorithm> to get std::swap. "Depends on" therefore indicates a strict requirement, a necessity so to speak, in the sense that there is not sufficient context to proceed without the requirement; failure will occur without the dependency.

    I chose this passage because it conveys the intended meaning as clearly as possible; other passages are more verbose, but they all include a similar meaning: necessity.

    Therefore, a "depends on" relationship means that the thing being depended on is required for the depending item to make sense, be whole and complete, and be usable in a context.

    To cut through that legalese red tape: "A depends on B" means "A requires B". This is basically what you'd understand "depend" to mean if you looked it up in a dictionary or used it in a sentence.

    Side effect

    This is more strictly defined, on page 10:

    Accessing an object designated by a volatile glvalue (3.10), modifying an object, calling a library I/O function, or calling a function that does any of those operations are all side effects, which are changes in the state of the execution environment.

    This means that anything which results in a change to the environment (such as RAM, network I/O, variables, and so on) is a side effect. This neatly fits with the notion of impurity/purity from functional languages, which is clearly what was intended. Note that the C++ standard does not require that such side effects be observable; modifying a variable in any way, even if that variable is never looked at, is still a side effect.

    However, due to the "as if" rule, such unobservable side effects may be removed, page 8:

    A conforming implementation executing a well-formed program shall produce the same observable behavior as one of the possible executions of the corresponding instance of the abstract machine with the same program and the same input. However, if any such execution contains an undefined operation, this International Standard places no requirement on the implementation executing that program with that input (not even with regard to operations preceding the first undefined operation).
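
    A small sketch of the distinction, using hypothetical function names: both functions below perform side effects in the standard's sense, but only the volatile accesses belong to the observable behavior, so under the as-if rule an implementation is free to drop the plain store.

```cpp
#include <cassert>

// The store to `scratch` modifies an object, so it is a side effect, but its
// value is never used. Under the as-if rule a compiler may remove the store.
int sum_with_dead_store(int a, int b) {
    int scratch = 0;
    scratch = a * b;
    static_cast<void>(scratch);  // only silences unused-variable warnings
    return a + b;
}

// Accesses through a volatile glvalue are part of the observable behavior,
// so the implementation has to perform both the store and the load.
int store_and_reload(int value) {
    volatile int sink = 0;
    sink = value;
    return sink;
}
```

    A caller cannot distinguish a build that keeps the dead store from one that removes it, which is exactly why the removal is permitted.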

    Depends on the side effects

    Putting these two definitions together, we can now define this phrase: something depends on the side effects when those changes to the execution environment are required in order to satisfy the senseful, whole, and complete operations of the program. If, without the side effects, some constraint is not satisfied that is required for the program to operate in a standard compliant way, we can say that it depends on the side effects.

    A simple example to illustrate this would be, as stated in another answer, a lock. A program that uses locks depends on the side effect of the lock, notably, the side effect of providing a serialized access pattern to some resource (simplified). If this side effect is violated, the constraints of the program are violated, and thus the program cannot be thought of as senseful (since race conditions or other hazards may occur).

    The program DEPENDS on the constraints that a lock provides, via side effects; violating those results in a program that is invalid.

    Depends on the side effects produced by the destructor

    Changing the language from referring to a lock to a destructor is simple and obvious; if the destructor has side effects which satisfy some constraint that is required by the program to be senseful, whole, complete, and usable, then it depends on the side effects produced by the destructor. This is not exactly difficult to understand, and follows quite readily from both a legalese interpretation of the standard and a cursory layman understanding of the words and how they are used.

    Now we can get into answering your questions:

    Under which condition(s), if any, does this program exhibit Undefined Behavior?

    Any time a dependency or requirement is not fulfilled because a destructor is not called, the behavior of any dependent code is undefined. But what does this really mean?

    1.3.24 undefined behavior
    behavior for which this International Standard imposes no requirements

    [ Note: Undefined behavior may be expected when this International Standard omits any explicit definition of behavior or when a program uses an erroneous construct or erroneous data.

    Permissible undefined behavior ranges from ignoring the situation completely with unpredictable results, to behaving during translation or program execution in a documented manner characteristic of the environment (with or without the issuance of a diagnostic message), to terminating a translation or execution (with the issuance of a diagnostic message).

    Many erroneous program constructs do not engender undefined behavior; they are required to be diagnosed. — end note ]

    Let's suppose for a moment that such behavior WAS defined.

    Suppose it was explicitly illegal. This would then require any standard compiler to detect this case, to diagnose it, to deal with it in some fashion. For example, any object not explicitly deleted would have to be deleted at program exit, requiring some sort of tracking mechanism and the ability to invoke destructors on arbitrary types, possibly not known at compile time. This is basically a garbage collector, but given that it's possible to hide pointers, to call malloc, and so on, it would be essentially infeasible to require this.

    Suppose it was explicitly allowed. This would also allow compilers to remove destructor calls, under the as-if rule, since hey, you can't depend on that behavior anyway. This would result in some nasty surprises, mostly related to memory not freeing very quickly or easily. To get around that, we'd all start using finalizers, and the problem arises yet again. Furthermore, allowing that behavior means that no library can be sure when their memory is recovered or if it ever will be, or if their locks, OS dependent resources, etc etc, will ever get returned. This pushes the requirements for clean up from the code using the resources to the code providing it, where it's basically impossible to deal with in a language like C or C++.

    Suppose it had a specific behavior; what behavior would this be? Any such behavior would have to be quite involved or it wouldn't cover the large number of cases. We've already covered two, and the idea of cleaning up for any given object at program exit imposes a large overhead. For a language meant to be fast or at least minimal, this is clearly an unnecessary burden.

    So instead, the behavior was labeled undefined, meaning any implementation is free to provide diagnostics, but also free to simply ignore the problem and leave it to you to figure out. But no matter what, if you depend on those constraints being satisfied but fail to call the destructor, you are getting undefined behavior. Even if the program works perfectly well, that behavior is undefined; it may throw an error message in some new version of Clang, it may delete your hard drive in some incredibly secure cryptographic OS of the far flung future, it may work until the end of time.

    But it's still undefined.

    Your Example

    Your example does not satisfy the "depends on" clause; no constraint that is required for the program to run is unsatisfied.

    1. Constructor requires a well formed pointer to a real variable: satisfied
    2. new requires a properly allocated buffer: satisfied
    3. printf requires an accessible variable, interpretable as an integer: satisfied

    Nowhere in this program does a certain value for x or a lack of that value result in a constraint being unsatisfied; you are not invoking undefined behavior. Nothing "depends" on these side effects; if you were to add a test which functioned as a constraint that required a certain value for "x", then it would be undefined behavior.

    As it stands, your example is not undefined behavior; it's merely wrong.

    Finally!

    Is forgetting to call a destructor any different than forgetting to call an ordinary function with the same body?

    It is impossible in many cases to define an ordinary function with the same body:

    1. A destructor is a member, not an ordinary function
    2. A function cannot access private or protected values
    3. A function cannot be required to be called upon destruction
    4. A finalizer also cannot be required to be called upon destruction
    5. An ordinary function cannot restore the memory to the OS without calling the destructor

    And no, calling free on an allocated object cannot restore the memory; free/malloc need not work on things allocated with new, and without calling the destructor, the private data members will not be released, resulting in a memory leak.

    Furthermore, forgetting to call a function will not result in undefined behavior if your program depends on the side effects it imposes; those side effects will simply not be imposed, and your program will not satisfy those constraints, and probably not work as intended. Forgetting to call a destructor, however, results in undefined behavior, as stated on page 66:

    For an object of a class type with a non-trivial destructor, the program is not required to call the destructor explicitly before the storage which the object occupies is reused or released; however, if there is no explicit call to the destructor or if a delete-expression (5.3.5) is not used to release the storage, the destructor shall not be implicitly called and any program that depends on the side effects produced by the destructor has undefined behavior.

    That is the passage you referenced in your original question. I don't see why you had to ask the question, given you already referenced it, but there you go.

  • 2020-12-03 07:42

    In the comments you've left a simple question that made me rethink what I said. I've removed the old answer because even if it had some value, it was far from the point.

    So you're saying my code is well-defined, since it "doesn't depend on that even if I print it"? No undefined behavior here?

    Let me say again that I don't precisely remember the definition of placement new operator and deallocation rules. Actually, I've not even read the newest C++ standard in full. But if the text you quoted is from there, then you are hitting the UB.

    Not due to Rand or Print. Or anything we "see".

    Any UB that occurs here is because your code assumes that you can safely "overwrite" an old 'object' without destroying the previous instance that was sitting at that place. The core side effect of a destructor is not "freeing handles/resources" (which you do manually in your code!) but leaving the space "ready for being reclaimed/reused".

    You have assumed that the usage of the memory chunks and lifetimes of objects are not well-tracked. I'm pretty sure that the C++ standard does not define that they are untracked.

    For example, imagine that you have the same code as provided, but that this struct/class has a vtable. Imagine that you are using a hyper-picky compiler with tons of debug checks, one that manages the vtable with extra care, allocates an extra bit flag, and injects code into base constructors and destructors that flips that flag to help trace errors. On such a compiler, this code would crash on the line of new (r) MakeRandom since the first object's lifetime has not been terminated. And I'm pretty sure that such a picky compiler would still be fully C++ compliant, just as your compiler surely is too.

    It's UB. It's only that most compilers really don't do such checks.
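
    The kind of lifetime tracking imagined above can be approximated in user code. The sketch below uses a hypothetical `Tracked` type with a live-object counter (an assumption for illustration, not something any particular compiler is known to do). It shows how overwriting an object without destroying it leaves the books unbalanced, and how destroying first keeps them balanced.

```cpp
#include <cassert>
#include <new>

// Hypothetical type that counts how many instances are currently alive.
struct Tracked {
    static int live;
    Tracked() { ++live; }
    ~Tracked() { --live; }
};
int Tracked::live = 0;

// Destroys the old object before constructing a new one in the same storage.
int live_after_balanced_reuse() {
    Tracked::live = 0;
    alignas(Tracked) unsigned char storage[sizeof(Tracked)];
    Tracked* t = new (storage) Tracked;
    t->~Tracked();               // end the first object's lifetime
    t = new (storage) Tracked;   // reuse the storage
    t->~Tracked();
    return Tracked::live;        // every constructor matched by a destructor
}

// Overwrites the old object without destroying it, as in the question.
int live_after_overwrite() {
    Tracked::live = 0;
    alignas(Tracked) unsigned char storage[sizeof(Tracked)];
    Tracked* t = new (storage) Tracked;
    t = new (storage) Tracked;   // the first object's destructor never runs
    t->~Tracked();
    return Tracked::live;        // one construction was never balanced
}
```

    A debugging build could check such a counter (or a per-object flag) and abort on imbalance, which is roughly the behavior the picky compiler above is imagined to have.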

  • 2020-12-03 07:45

    This makes sense if you accept that the Standard is requiring allocation to be balanced by destruction in the case where destructors affect program behavior. I.e. the only plausible interpretation is that if a program

    • ever fails to call the destructor (perhaps indirectly through delete) on an object and
    • said destructor has side-effects,

    then the program is doomed to the land of UB. (OTOH, if the destructor doesn't affect program behavior, then you are off the hook. You can skip the call.)

    Note added: Side effects are discussed in this SO article, and I'll not repeat that here. A conservative inference is that "program ... depends on destructor" is equivalent to "destructor has a side-effect."

    Additional note: However, the Standard seems to allow for a more liberal interpretation. It does not formally define dependence of a program. (It does define a specific quality of expressions as dependence-carrying, but this does not apply here.) Yet in over 100 uses of derivatives of "A depends on B" and "A has a dependency on B," it employs the conventional sense of the word: a variation in B leads directly to variation in A. Consequently, it does not seem a leap to infer that a program P depends on side effect E to the extent that performance or non-performance of E results in a variation in observable behavior during execution of P. Here we are on solid ground. The meaning of a program - its semantics - is equivalent under the Standard to its observable behavior during execution, and this is clearly defined.

    The least requirements on a conforming implementation are:

    • Access to volatile objects are evaluated strictly according to the rules of the abstract machine.

    • At program termination, all data written into files shall be identical to one of the possible results that execution of the program according to the abstract semantics would have produced.

    • The input and output dynamics of interactive devices shall take place in such a fashion that prompting output is actually delivered before a program waits for input. What constitutes an interactive device is implementation-defined.

    These collectively are referred to as the observable behavior of the program.

    Thus, by the Standard's conventions, if a destructor's side effect would ultimately affect volatile storage access, input, or output, and that destructor is never called, the program has UB.

    Put yet another way: If your destructors do significant things and aren't consistently called, your program (says the Standard) ought to be considered, and is hereby declared, useless.

    Is this overly restrictive, nay pedantic, for a language standard? (After all, the Standard prevents the side-effect from occurring due to an implicit destructor call and then drubs you if the destructor would have caused a variation in observable behavior if it had been called!) Perhaps so. But it does make sense as a way to insist on well-formed programs.

  • 2020-12-03 07:45

    First of all, we need to define undefined behavior, which according to the C FAQ would be when:

    Anything at all can happen; the Standard imposes no requirements. The program may fail to compile, or it may execute incorrectly (either crashing or silently generating incorrect results), or it may fortuitously do exactly what the programmer intended.

    In other words, the programmer cannot predict what will happen when the program is executed. This doesn't mean that the program or the OS will crash; it simply means that the program's future state is only known once it has been executed.

    So, explained in math notation, if a program is reduced to a function F which makes a transformation from an initial state Is into a final state Fs, given certain initial conditions Ic


    F(Is,Ic) -> Fs


    And if you evaluate the function (execute the program) n times, given that n-> ∞


    F(Is,Ic) -> Fs1, F(Is,Ic) -> Fs2, ..., F(Is,Ic) -> Fsn, n-> ∞


    Then:

    • A defined behavior would be given by all the resulting states being the same: Fs1 = Fs2 = ... = Fsn, given that n-> ∞
    • An undefined behavior would be given by the possibility of obtaining different finished states among different executions. Fs1 ≠ Fs2 ≠ ... ≠ Fsn, given that n-> ∞

    Notice how I highlight possibility, because undefined behavior is exactly that. There exists a possibility that the program executes as desired, but nothing guarantees that it would do so, or that it wouldn't do it.

    Hence, answering your question:

    Is forgetting to call a destructor any different than forgetting to call an ordinary function with the same body?

    Given that a destructor is a function that could be called even when you don't explicitly call it, forgetting to call a destructor IS different from forgetting to call an ordinary function, and doing so COULD lead to undefined behavior.
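
    A sketch of that difference, with hypothetical type and function names: for an object with automatic storage the destructor runs at scope exit without any call appearing in the source, whereas an ordinary member function with the same body runs only if it is called.

```cpp
#include <cassert>

// Hypothetical type whose cleanup lives in the destructor.
struct WithDestructor {
    int& cleanups;
    explicit WithDestructor(int& counter) : cleanups(counter) {}
    ~WithDestructor() { ++cleanups; }
};

// Hypothetical type whose cleanup lives in an ordinary member function.
struct WithPlainFunction {
    int& cleanups;
    explicit WithPlainFunction(int& counter) : cleanups(counter) {}
    void cleanup() { ++cleanups; }
};

// No explicit call anywhere, yet the destructor runs when the scope ends.
int cleanups_with_destructor() {
    int count = 0;
    {
        WithDestructor r(count);
    }  // ~WithDestructor() is invoked implicitly here
    return count;
}

// The call to cleanup() is "forgotten", and nothing makes it for us.
int cleanups_with_forgotten_call() {
    int count = 0;
    {
        WithPlainFunction r(count);
        // r.cleanup() is deliberately missing
    }
    return count;
}
```

    The same implicit call does not happen for objects created with new or placement new, which is where forgetting the destructor becomes possible in the first place.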

    The justification is given by the fact that, when you forget to call an ordinary function you are SURE, ahead of time, that that function won't be called at any point in your program, even when you run your program an infinite number of times.

    However, when you forget to call a destructor, the situation is different. As the post at https://stackoverflow.com/questions/3179494/under-what-circumstances-are-c-destructors-not-going-to-be-called shows, there are circumstances in which C++ destructors are not called, so even over an infinite number of runs you can't tell beforehand when the destructor will be called and when it won't. This uncertainty means that you can't guarantee the same final state, which leads to UB.

    So answering your second question:

    Under which condition(s), if any, does this program exhibit Undefined Behavior?

    The conditions are those under which C++ destructors are not called, as described in the link that I referenced.
