How can I design storage that conforms to the standard's implementation of std::any?

问题

The standard working draft (n4582, 20.6.3, p.552) states the following suggestion for implementations of std::any:

Implementations should avoid the use of dynamically allocated memory for a small contained object. [ Example: where the object constructed is holding only an int. —end example ] Such small-object optimization shall only be applied to types T for which is_nothrow_move_constructible_v is true.

As far as I know, std::any can be easily implemented through type erasure/virtual functions and dynamically allocated memory.

How can std::any avoid dynamic allocation and still destroy such values if no compile time information is known at the time of destruction; how would a solution that follows the standard's suggestion be designed?

If anyone wants to see a possible implementation of the non-dynamic part, I've posted one on Code Review: https://codereview.stackexchange.com/questions/128011/an-implementation-of-a-static-any-type

It's a little too long for an answer here. It's based on the suggestions of Kerrek SB on the comments below.

回答1:

Typically, any takes anything and dynamically allocates a new object from it:

struct any {
    placeholder* place;

    template <class T>
    any(T const& value) {
        place = new holder<T>(value);
    }

    ~any() {
        delete place;
    }
};

We use the fact that placeholder is polymorphic to handle all of our operations - destruction, cast, etc. But now we want to avoid allocation, which means we avoid all the nice things that polymorphism gives us - and need to reimplement them. To start with, we'll have some union:

union Storage {
    placeholder* ptr;
    std::aligned_storage_t<sizeof(ptr), sizeof(ptr)> buffer;
};

where we have some template <class T> is_small_object { ... } to decide whether or not we're doing ptr = new holder<T>(value) or new (&buffer) T(value). But construction isn't the only thing we have to do - we also have to do destruction and type info retrieval, which look different depending on which case we're in. Either we're doing delete ptr or we're doing static_cast<T*>(&buffer)->~T();, the latter of which depends on keeping track of T!

So we introduce our own vtable-like thing. Our any will then hold onto:

enum Op { OP_DESTROY, OP_TYPE_INFO };
void (*vtable)(Op, Storage&, const std::type_info* );
Storage storage;

You could instead create a new function pointer for each op, but there are probably several other ops that I'm missing here (e.g. OP_CLONE, which might call for changing the passed-in argument to be a union...) and you don't want to just bloat your any size with a bunch of function pointers. This way we lose a tiny bit of performance in exchange for a big difference in size.

On construction, we then populate both the storage and the vtable:

template <class T,
          class dT = std::decay_t<T>,
          class V = VTable<dT>,
          class = std::enable_if_t<!std::is_same<dT, any>::value>>
any(T&& value)
: vtable(V::vtable)
, storage(V::create(std::forward<T>(value))
{ }

where our VTable types are something like:

template <class T>
struct PolymorphicVTable {
    template <class U>
    static Storage create(U&& value) {
        Storage s;
        s.ptr = new holder<T>(std::forward<U>(value));
        return s;
    }

    static void vtable(Op op, Storage& storage, const std::type_info* ti) {
        placeholder* p = storage.ptr;

        switch (op) {
        case OP_TYPE_INFO:
            ti = &typeid(T);
            break;
        case OP_DESTROY:
            delete p;
            break;
        }
    }
};

template <class T>
struct InternalVTable {
    template <class U>
    static Storage create(U&& value) {
        Storage s;
        new (&s.buffer) T(std::forward<U>(value));
        return s;
    }

    static void vtable(Op op, Storage& storage, const std::type_info* ti) {
        auto p = static_cast<T*>(&storage.buffer);

        switch (op) {
        case OP_TYPE_INFO:
            ti = &typeid(T);
            break;
        case OP_DESTROY:
            p->~T();
            break;
        }
    }
};

template <class T>
using VTable = std::conditional_t<sizeof(T) <= 8 && std::is_nothrow_move_constructible<T>::value,
                   InternalVTable<T>,
                   PolymorphicVTable<T>>;

and then we just use that vtable to implement our various operations. Like:

~any() {
    vtable(OP_DESTROY, storage, nullptr);
}

回答2:

How can std::any avoid dynamic allocation and still destroy such values if no compile time information is known at the time of destruction

That seems like a loaded question. The latest draft requires this constructor:

template <class ValueType> any(ValueType &&value);

I can't think of why you need to have "type erasure" unless you want code to handle both small and large cases at the same time. But then why not have something like this?¹

template <typename T>
  struct IsSmallObject : ...type_traits...

Small objects: use placement new and explicit destructor calls.
Large object: use an allocator

In the former case, you can have a pointer to your uninitialized storage:

union storage
{
    void* ptr;
    typename std::aligned_storage<3 * sizeof(void*), 
                std::alignment_of<void*>::value>::type buffer;
};

Using a union as @KerrekSB suggested.

Notice that the type is not needed to be known for the storage class. Using some kind of handle/dispatch (not sure about the real name of the idiom) system it becomes trivial at this point.

First let's tackle what destructing would look like:

  template <typename T>
  struct SmallHandler
  {
    // ...

    static void destroy(any & bye)
    {
        T & value = *static_cast<T *>(static_cast<void*>(&bye.storage.buffer));
        value.~T();
        this.handle = nullptr;
    }

    // ...
   };

Then the any class:

// Note, we don't need to know `T` here!
class any
{
  // ...

  void clear() _NOEXCEPT
  {
    if (handle) this->call(destroy);
  }

  // ...
  template <class>
  friend struct SmallHandler;
};

Here we factor out the logic that needs to know the compile-time type to a handler/dispatch system, while the bulk of the any class only has to deal with RTTI.

¹: Here's the conditions I would check for:

nothrow_move_constructible
sizeof(T) <= sizeof(storage). In my case this is 3 * sizeof(void*)
alignof(T) <= alignof(storage). In my case this is std::alignment_of<void*>::value

回答3:

Inspired by boost any I came up with this (test it on ideone) (I created a minimal case to show how to destroy a type erased container like any without dynamic memory. I focused only on constructor/destructor, omitting everything else, ignoring move semantics and other things)

#include <iostream>
#include <type_traits>

using std::cout;
using std::endl;

struct A { ~A() { cout << "~A" << endl; }};
struct B { ~B() { cout << "~B" << endl; }};

struct Base_holder {
  virtual ~Base_holder() {}
};

template <class T>
struct Holder : Base_holder {
  T value_;

  Holder(T val) : value_{val} {}
};

struct Any {  
  std::aligned_storage_t<64> buffer_;
  Base_holder* p_;

  template <class T>
  Any(T val)
  {
    p_ = new (&buffer_) Holder<T>{val};
  }

  ~Any()
  {
    p_->~Base_holder();
  }
};

auto main() -> int
{  
  Any a(A{});
  Any b(B{});

  cout << "--- Now we exit main ---" << endl;
}

The output:

~A
~A
~B
~B
--- Now we exit main ---
~B
~A

Of course the first one are temporaries being destroyed the last two prove that destruction of Any calls the right destructor.

The trick is to have polymorphism. That's why we have Base_holder and Holder. We initialize them via placement new in a std::aligned_storage and we explicitly call the destructor.

This is just to prove you can call the right destructor without knowing the type held by Any. Of course in a real implementation you would have an union for this or a pointer to a dynamically allocated memory and a boolean telling you which one you have.

来源：https://stackoverflow.com/questions/37094710/how-can-i-design-storage-that-conforms-to-the-standards-implementation-of-std

标签

c++

c++14

stdany