This code is just for illustrating the question.
#include
struct MyCallBack {
void Fire() {
}
};
int main()
{
MyCallBack cb;
I propose a custom class for your specific usage.
While it's true that you shouldn't try to re-implement existing library functionality because the library ones will be much more tested and optimized, it's also true that it applies for the general case. If you have a particular situation like in your example and the standard implementation doesn't suite your needs you can explore implementing a version tailored to your specific use case, which you can measure and tweak as necessary.
So I have created a class akin to std::function<void (void)>
that works only for methods and has all the storage in place (no dynamic allocations).
I have lovingly called it Trigger
(inspired by your Fire
method name). Please do give it a more suited name if you want to.
// helper alias for method
// can be used in user code
template <class T>
using Trigger_method = auto (T::*)() -> void;
namespace detail
{
// Polymorphic classes needed for type erasure
struct Trigger_base
{
virtual ~Trigger_base() noexcept = default;
virtual auto placement_clone(void* buffer) const noexcept -> Trigger_base* = 0;
virtual auto call() -> void = 0;
};
template <class T>
struct Trigger_actual : Trigger_base
{
T& obj;
Trigger_method<T> method;
Trigger_actual(T& obj, Trigger_method<T> method) noexcept : obj{obj}, method{method}
{
}
auto placement_clone(void* buffer) const noexcept -> Trigger_base* override
{
return new (buffer) Trigger_actual{obj, method};
}
auto call() -> void override
{
return (obj.*method)();
}
};
// in Trigger (bellow) we need to allocate enough storage
// for any Trigger_actual template instantiation
// since all templates basically contain 2 pointers
// we assume (and test it with static_asserts)
// that all will have the same size
// we will use Trigger_actual<Trigger_test_size>
// to determine the size of all Trigger_actual templates
struct Trigger_test_size {};
}
struct Trigger
{
std::aligned_storage_t<sizeof(detail::Trigger_actual<detail::Trigger_test_size>)>
trigger_actual_storage_;
// vital. We cannot just cast `&trigger_actual_storage_` to `Trigger_base*`
// because there is no guarantee by the standard that
// the base pointer will point to the start of the derived object
// so we need to store separately the base pointer
detail::Trigger_base* base_ptr = nullptr;
template <class X>
Trigger(X& x, Trigger_method<X> method) noexcept
{
static_assert(sizeof(trigger_actual_storage_) >=
sizeof(detail::Trigger_actual<X>));
static_assert(alignof(decltype(trigger_actual_storage_)) %
alignof(detail::Trigger_actual<X>) == 0);
base_ptr = new (&trigger_actual_storage_) detail::Trigger_actual<X>{x, method};
}
Trigger(const Trigger& other) noexcept
{
if (other.base_ptr)
{
base_ptr = other.base_ptr->placement_clone(&trigger_actual_storage_);
}
}
auto operator=(const Trigger& other) noexcept -> Trigger&
{
destroy_actual();
if (other.base_ptr)
{
base_ptr = other.base_ptr->placement_clone(&trigger_actual_storage_);
}
return *this;
}
~Trigger() noexcept
{
destroy_actual();
}
auto destroy_actual() noexcept -> void
{
if (base_ptr)
{
base_ptr->~Trigger_base();
base_ptr = nullptr;
}
}
auto operator()() const
{
if (!base_ptr)
{
// deal with this situation (error or just ignore and return)
}
base_ptr->call();
}
};
Usage:
struct X
{
auto foo() -> void;
};
auto test()
{
X x;
Trigger f{x, &X::foo};
f();
}
Warning: only tested for compilation errors.
You need to thoroughly test it for correctness.
You need to profile it and see if it has a better performance than other solutions. The advantage of this is because it's in house cooked you can make tweaks to the implementation to increase performance on your specific scenarios.
Many std::function implementations will avoid allocations and use space inside the function class itself rather than allocating if the callback it wraps is "small enough" and has trivial copying. However, the standard does not require this, only suggests it.
On g++, a non-trivial copy constructor on a function object, or data exceeding 16 bytes, is enough to cause it to allocate. But if your function object has no data and uses the builtin copy constructor, then std::function won't allocate. Also, if you use a function pointer or a member function pointer, it won't allocate.
While not directly part of your question, it is part of your example. Do not use std::bind. In virtually every case, a lambda is better: smaller, better inlining, can avoid allocations, better error messages, faster compiles, the list goes on. If you want to avoid allocations, you must also avoid bind.
Unfortunately, allocators for std::function
has been dropped in C++17.
Now the accepted solution to avoid dynamic allocations inside std::function
is to use lambdas instead of std::bind
. That does work, at least in GCC - it has enough static space to store the lambda in your case, but not enough space to store the binder object.
std::function<void()> func = [&cb]{ cb.Fire(); };
// sizeof lambda is sizeof(MyCallBack*), which is small enough
As a general rule, with most implementations, and with a lambda which captures only a single pointer (or a reference), you will avoid dynamic allocations inside std::function
with this technique (it is also generally better approach as other answer suggests).
Keep in mind, for that to work you need guarantee that this lambda will outlive the std::function
. Obviously, it is not always possible, and sometime you have to capture state by (large) copy. If that happens, there is no way currently to eliminate dynamic allocations in functions, other than tinker with STL yourself (obviously, not recommended in general case, but could be done in some specific cases).
As an addendum to the already existent and correct answer, consider the following:
MyCallBack cb;
std::cerr << sizeof(std::bind(&MyCallBack::Fire, &cb)) << "\n";
auto a = [&] { cb.Fire(); };
std::cerr << sizeof(a);
This program prints 24 and 8 for me, with both gcc and clang. I don't exactly know what bind
is doing here (my understanding is that it's a fantastically complicated beast), but as you can see, it's almost absurdly inefficient here compared to a lambda.
As it happens, std::function
is guaranteed to not allocate if constructed from a function pointer, which is also one word in size. So constructing a std::function
from this kind of lambda, which only needs to capture a pointer to an object and should also be one word, should in practice never allocate.
Run this little hack and it probably will print the amount of bytes you can capture without allocating memory:
#include <iostream>
#include <functional>
#include <cstring>
void h(std::function<void(void*)>&& f, void* g)
{
f(g);
}
template<size_t number_of_size_t>
void do_test()
{
size_t a[number_of_size_t];
std::memset(a, 0, sizeof(a));
a[0] = sizeof(a);
std::function<void(void*)> g = [a](void* ptr) {
if (&a != ptr)
std::cout << "malloc was called when capturing " << a[0] << " bytes." << std::endl;
else
std::cout << "No allocation took place when capturing " << a[0] << " bytes." << std::endl;
};
h(std::move(g), &g);
}
int main()
{
do_test<1>();
do_test<2>();
do_test<3>();
do_test<4>();
}
With gcc version 8.3.0
this prints
No allocation took place when capturing 8 bytes.
No allocation took place when capturing 16 bytes.
malloc was called when capturing 24 bytes.
malloc was called when capturing 32 bytes.