问题
boost::serialization
contains this code:
reinterpret_cast<std::ptrdiff_t>(
static_cast<Derived *>(
reinterpret_cast<Base *>(1 << 20)
)
) - (1 << 20)
Its purpose is to calculate an offset between base and derived classes. Is this code free of undefined behaviour?
The reason why I am asking is that ASAN+UBSAN complain. For example, this code
#include <iostream>
class Foo { public: virtual void foo() {} };
class Base { public: virtual void base() {} };
class Derived: public Foo, public Base {};
int main()
{
std::cout <<
(reinterpret_cast<std::ptrdiff_t>(
static_cast<Derived *>(
reinterpret_cast<Base *>(1 << 20)
)
) - (1 << 20));
}
compiled as (gcc version 9.2.1)
g++ -fsanitize=address -fsanitize=undefined -fno-omit-frame-pointer -g main.cpp
produces this output
AddressSanitizer:DEADLYSIGNAL
=================================================================
==72613==ERROR: AddressSanitizer: SEGV on unknown address 0x0000000ffff8 (pc 0x0000004012d9 bp 0x7ffd5b3eecf0 sp 0x7ffd5b3eece0 T0)
==72613==The signal is caused by a READ memory access.
#0 0x4012d8 in main main.cpp:13
#1 0x7f74a90d5f42 in __libc_start_main (/lib64/libc.so.6+0x23f42)
#2 0x40112d in _start (/home/.../a.out+0x40112d)
AddressSanitizer can not provide additional info.
SUMMARY: AddressSanitizer: SEGV main.cpp:13 in main
Is it a false-positive or does this code really have problems?
Update 09.12.2019: based on Filipp's proposal and my experiments, this code seems to work and does not produce any warnings:
std::aligned_storage<sizeof(Derived)>::type data;
reinterpret_cast<char*>(&data)
- reinterpret_cast<char*>(
static_cast<Base*>(
reinterpret_cast<Derived*>(&data)));
Does anybody see any problems with this snippet? If not, I would propose it to boost
.
Update 16.12.2019: the fix was merged to boost::serialization
develop
branch.
回答1:
As indicated by another answer, the problem is that (1 << 20)
isn't the address of any object. Using a char[]
that could in principle store a Derived
seems to work around the problem:
#include <stdint.h>
#include <stddef.h>
#include <stdio.h>
class Foo { public: virtual void foo() {} };
class Base { public: virtual void base() {} };
class Derived: public Foo, public Base {};
int main() {
alignas (Derived) char const buffer[sizeof(Derived)] = {};
Derived const* const derived = reinterpret_cast<Derived const*>(buffer);
Base const* const base = derived;
ptrdiff_t const delta =
reinterpret_cast<char const*>(derived) -
reinterpret_cast<char const*>(base);
::printf("%td\n", delta);
return 0;
}
Is it a false-positive or does this code really have problems?
According to a strict reading of the standard the code does exhibit UB, so in that sense it is not a false positive. In practice, both boost authors and compiler writers agree this is just pointer math, so it should do the right thing anyway.
Edit: Unless one of the bases involved is virtual
. Then the cast would try to read the offset from the vtable.
Edit 2: Using nullptr
produces 0. Changed to using local aligned buffer.
回答2:
reinterpret_cast<Base *>(1 << 20)
This is not a valid pointer.
static_cast
ing it requires evaluating it, which has undefined behaviour.
This is an interesting "trick" but it doesn't seem to be well-defined, and that's borne out by the results of the -fsanitize
options.
This does not seem to be unusual for boost::serialization.
来源:https://stackoverflow.com/questions/59216758/derived-class-offset-calculation-in-boostserialization-is-it-valid