Pointer arithmetic across subobject boundaries

Submitted by 三世轮回 on 2019-12-03 22:16:13

Updated: This answer at first missed some information and thus led to wrong conclusions.

In your examples, initial and rest are clearly distinct (array) objects, so comparing pointers to initial (or its elements) with pointers to rest (or its elements) is

  • UB, if you use the difference of the pointers. (§5.7,6)
  • unspecified, if you use relational operators (§5.9,2)
  • well defined for == (so the second snippet is fine, see below)
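
The last two cases can be sketched in code. This is only a minimal illustration: Block is a hypothetical stand-in for Derived<float, 10> (its real definition is not shown here), and same_address and ordered_before are names made up for this example. The std::less part is a swapped-in technique, not something the question used: it is required to give a strict total order for pointers even where the built-in < is unspecified.

```cpp
#include <functional>

// Hypothetical stand-in for Derived<float, 10>; the real definition
// is not shown in this answer.
struct Block {
    float initial[1];
    float rest[9];
};

// Equality between pointers into different arrays is well defined.
inline bool same_address(const float* a, const float* b) {
    return a == b;
}

// a < b would be unspecified for pointers into different arrays, but
// std::less is required to yield a strict total order for pointer types.
inline bool ordered_before(const float* a, const float* b) {
    return std::less<const float*>()(a, b);
}
```

For two distinct elements, exactly one of ordered_before(a, b) and ordered_before(b, a) is true. Which one is up to the implementation, although in practice it follows the member order.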

First snippet:

Building the difference in the first snippet is undefined behavior, per the passage you quoted (§5.7,6):

Unless both pointers point to elements of the same array object, or one past the last element of the array object, the behavior is undefined.

To clarify the UB parts of the first example code:

//first example
int main()
{
    Derived<float, 10> d;
    assert(&d.rest[9] - &d.initial == 10);            //!!! UB !!!
    assert(&d.end - &d.begin == sizeof(float) * 10);  //!!! UB !!! (*)
    return 0;
}

The line marked with (*) is interesting: d.begin and d.end are not elements of the same array, so the subtraction is UB. This holds even though you may reinterpret_cast<char*>(&d) and find both of their addresses inside the resulting char array: that array is the representation of d as a whole, and it is not to be seen as an access to the individual parts of d. So while the operation will probably just work and give the expected result on any implementation one can dream of, it is still UB as a matter of definition.
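
If all you need is the byte distance between two members, you can get it without subtracting unrelated pointers at all. The following is a sketch under assumptions: Block is a hypothetical standard-layout stand-in (the real Derived is not shown here, and offsetof is only guaranteed to work for standard-layout classes), and rest_offset_in_bytes is a name made up for this example.

```cpp
#include <cstddef>

// Hypothetical standard-layout stand-in for Derived<float, 10>.
struct Block {
    float initial[1];
    float rest[9];
};

// Byte distance from the start of initial to the start of rest,
// computed from member offsets instead of pointer subtraction.
inline std::size_t rest_offset_in_bytes() {
    return offsetof(Block, rest) - offsetof(Block, initial);
}
```

The result is at least sizeof(float). Whether it is exactly sizeof(float) depends on whether the implementation inserts padding between the two members.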

Second snippet:

This is actually well-defined behavior, but with an implementation-defined result:

int main()
{
    Derived<float, 10> d;
    assert(&d.rest[9] - &d.rest[0] == 9);
    assert(&d.rest[0] == &d.initial[1]);         //(!)
    assert(&d.initial[1] - &d.initial[0] == 1);
    return 0;
}

The line marked with (!) is not UB, but its result is implementation defined, since padding, alignment and the mentioned instrumentation might play a role. But if that assertion holds, you can use the two object parts like one array.

You would know that rest[0] lies immediately after initial[0] in memory. At first sight, you could not easily use the equality:

  • &initial[1] points one past the end of initial; dereferencing it is UB.
  • rest[-1] is clearly out of bounds.

But here §3.9.2,3 comes into play:

If an object of type T is located at an address A, a pointer of type cv T* whose value is the address A is said to point to that object, regardless of how the value was obtained. [ Note: For instance, the address one past the end of an array (5.7) would be considered to point to an unrelated object of the array’s element type that might be located at that address.

So provided that &initial[1] == &rest[0], the memory layout is the same as if there were only one array, and all will be OK.

You could iterate over both arrays, since you could apply some "pointer context switch" at the boundaries. So to your last snippet: the swap is not needed!

However, there are some caveats: rest[-1] is UB, and so would be initial[2], because of §5.7,5:

If both the pointer operand and the result point to elements of the same array object, or one past the last element of the array object, the evaluation shall not produce an overflow; otherwise, the behavior is undefined.

(emphasis mine). So how do these two fit together?

  • "Good path": &initial[1] is ok, and since &initial[1] == &rest[0] you can take that address and go on to increment the pointer to access the other elements of rest, because of §3.9.2,3
  • "Bad path": initial[2] is *(initial + 2), but by §5.7,5 the expression initial + 2 is already UB, so you never get to use §3.9.2,3 here.

Together: you have to stop at the boundary, take a short break to check that the addresses are equal, and then you can move on.
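
A minimal sketch of that "stop at the boundary" pattern follows. The assumptions: Block is a hypothetical stand-in for the question's Derived<T, N> (with N >= 2), and is_contiguous and sum_all are names made up for this example. To stay on the safe side, the loop re-derives its pointer from rest at the boundary instead of carrying the one-past-the-end pointer of initial across, so it stays valid even if the addresses turn out not to be equal.

```cpp
#include <cstddef>

// Hypothetical stand-in for Derived<T, N>; requires N >= 2.
template <typename T, std::size_t N>
struct Block {
    T initial[1];
    T rest[N - 1];
};

// The check at the boundary: does rest start exactly where initial ends?
// Forming and comparing the one-past-the-end pointer is allowed.
template <typename T, std::size_t N>
bool is_contiguous(const Block<T, N>& b) {
    return b.initial + 1 == b.rest;
}

// Sums all N elements without ever computing initial + 2.
template <typename T, std::size_t N>
T sum_all(const Block<T, N>& b) {
    T total = T();

    const T* p = b.initial;
    const T* const initial_end = b.initial + 1;  // one past the end: fine to form
    for (; p != initial_end; ++p) total += *p;

    // Boundary reached: switch to a pointer derived from rest
    // (the "pointer context switch") and continue from there.
    p = b.rest;
    const T* const rest_end = b.rest + (N - 1);
    for (; p != rest_end; ++p) total += *p;

    return total;
}
```

For a Block<int, 4> initialized with {{1}, {2, 3, 4}}, sum_all visits every element exactly once. is_contiguous only tells you whether the two arrays happen to be adjacent on your implementation.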
