Does giving data an effective type count as a side-effect?

半阙折子戏 2021-02-06 07:31

Suppose I have a chunk of dynamically allocated data:

#include <stdlib.h>

void* allocate (size_t n)
{
  void* foo = malloc(n);
  /* ... */
  return foo;
}

I wish to use

2 Answers
  •  太阳男子
    2021-02-06 07:57

    Nothing in the Standard suggests that a write to an object needs to be recognized as setting the Effective Type only when the operation has other side-effects as well (such as changing the pattern of bits stored in that object). On the other hand, compilers that use aggressive type-based optimization seem unable to recognize a possible change of an object's Effective Type as a side-effect which must be maintained even if the write would have no other observable side-effects.
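    As a minimal sketch of the kind of write being discussed (not taken from the original answer), the helper below reuses one malloc'd block first as an int and then as a float. The name retype_demo is hypothetical. Under the Effective Type rule for storage with no declared type, each store through a non-character lvalue sets the Effective Type for that access and for later reads that do not modify the value, so each read here matches the most recent write.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical helper: reuses one malloc'd block as an int, then as a float.
   Returns 1 if both reads saw the value most recently stored, 0 otherwise,
   and -1 if the allocation failed. */
int retype_demo(void)
{
    size_t n = sizeof(int) > sizeof(float) ? sizeof(int) : sizeof(float);
    void *storage = malloc(n);   /* no declared type, no Effective Type yet */
    if (storage == NULL)
        return -1;

    int *ip = storage;
    *ip = 42;                    /* store sets the Effective Type to int */
    int first = *ip;             /* read as int matches the Effective Type */

    float *fp = storage;
    *fp = 1.5f;                  /* store changes the Effective Type to float */
    float second = *fp;          /* read as float matches the Effective Type */

    free(storage);
    return first == 42 && second == 1.5f;
}
```

    In this sketch every retyping store also changes the stored bits. The difficulty described in this answer concerns stores that change the Effective Type while leaving the bit pattern as it was.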

    To understand what the Effective Type rule actually says, I think it's necessary to understand where it came from. So far as I can tell, it appears to be derived from Defect Report #028, more specifically the rationale used to justify the conclusion given therein. The conclusion given is reasonable, but the rationale given is absurd.

    Essentially, the basic premise involves the possibility of something like:

    void actOnTwoThings(T1 *p1, T2 *p2)
    {
      /* ... code that uses p1 and p2 ... */
    }

    /* ... in some other function ... */
      union { T1 v1; T2 v2; } u;
      actOnTwoThings(&u.v1, &u.v2);
    

    Because the act of writing a union as one type and reading it as another yields Implementation-Defined behavior, the behavior of writing one union member via pointer and reading another isn't fully defined by the Standard, and should therefore (by the logic of DR #028) be treated as Undefined Behavior. Although the use of p1 and p2 to access the same storage should in fact be treated as UB in many scenarios like the above, the rationale is totally faulty. Specifying that an action yields Implementation-Defined behavior is very different from saying that it yields Undefined Behavior, especially in cases where the Standard would impose limits on what the Implementation-Defined behavior could be.

    A key result of deriving pointer-type rules from the behavior of unions is that behavior is fully and unambiguously defined, with no Implementation-Defined aspects, if code writes a union any number of times using any members, in any sequence, and then reads the last member written. While requiring that implementations allow for this will block some otherwise-useful optimizations, it's pretty clear that the Effective Type rules are written to require such behavior.
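    The fully defined case can be sketched as follows. The union number and the function last_member_written are hypothetical names chosen for illustration. The code writes several members in sequence and then reads only the member written last.

```c
#include <assert.h>

/* Hypothetical union used only for illustration. */
union number {
    int    i;
    float  f;
    double d;
};

/* Writes several members in sequence and reads back only the member
   written last, which is the case described above as fully defined. */
double last_member_written(void)
{
    union number u;
    u.i = 7;        /* first write, as int */
    u.f = 2.5f;     /* second write, as float, replaces the int value */
    u.d = 0.25;     /* final write, as double */
    return u.d;     /* reads the last member written: yields 0.25 */
}
```

    Reading u.i or u.f at the end instead would fall into the write-one-member, read-another case that the preceding paragraphs discuss.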

    A bigger problem arising from basing type rules on the behavior of unions is that the action of reading a union using one type and writing it back with another type need not be regarded as having any side-effects if the new bit pattern matches the old. Since an implementation would have to define the new bit pattern as representing the value that was written as the new type, it would also have to define the (identical) old bit pattern as representing that same value. Given the function (assume 'long' and 'long long' have the same size and representation):

     long test(long *p1, long long *p2, void *p3)
     {
       if (*p1)
       {
         long long temp;
         *p2 = 1;
         temp = *(long long*)p3;
         *(long*)p3 = temp;
       }
       return *p1;
     }
    

    Both gcc and clang will decide that the write via *(long*)p3 can't have any effect, since it simply stores back the same bit pattern that had been read via *(long long*)p3. That would be true if the following read of *p1 were processed with Implementation-Defined behavior in the event the storage was written via *p2, but it isn't true if that case is regarded as UB. Unfortunately, since the Standard is inconsistent about whether the behavior is Implementation-Defined or Undefined, it's inconsistent about whether the write needs to be regarded as a side-effect.
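    One way to observe the issue is a small driver in which all three pointers designate the same allocated storage. This is a sketch under stated assumptions: run_aliasing_case is a hypothetical name, the storage initially holds the long value 2, and long and long long are assumed to have the same size (checked at compile time). If the final store through *(long*)p3 is honored as resetting the Effective Type, the last read of *p1 should see the 1 written via *p2. A compiler that drops that store may instead reuse the value of *p1 it loaded earlier, so the result can depend on the compiler and on optimization settings.

```c
#include <assert.h>
#include <stdlib.h>

/* Assumption from the answer: long and long long share a size. */
static_assert(sizeof(long) == sizeof(long long),
              "sketch assumes long and long long have the same size");

/* The function from the answer, reproduced unchanged. */
long test(long *p1, long long *p2, void *p3)
{
    if (*p1)
    {
        long long temp;
        *p2 = 1;
        temp = *(long long*)p3;
        *(long*)p3 = temp;
    }
    return *p1;
}

/* Hypothetical driver: all three pointers designate the same allocated
   storage, which initially holds the long value 2. Returns -1 if the
   allocation failed. */
long run_aliasing_case(void)
{
    void *storage = malloc(sizeof(long long));
    if (storage == NULL)
        return -1;
    *(long*)storage = 2;              /* Effective Type starts as long */
    long result = test(storage, storage, storage);
    free(storage);
    return result;
}
```

    Comparing the result of run_aliasing_case() across builds with and without -fno-strict-aliasing is one way to check whether a given compiler treats the retyping store as a side-effect.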

    From a practical perspective, when not using -fno-strict-aliasing, gcc and clang should be regarded as processing a dialect of C where Effective Types, once set, become permanent. They cannot reliably recognize all cases where Effective Types may be changed, and the logic necessary to handle that could easily and efficiently handle many cases which the authors of gcc have long claimed cannot possibly be handled without gutting optimization.
