I am reading \"Effective Modern C++\". In the item related to std::unique_ptr
it\'s stated that if the custom deleter is a stateless object, then no size fees o
From a unique_ptr
implementation:
template<class _ElementT, class _DeleterT = std::default_delete<_ElementT>>
class unique_ptr
{
public:
// public interface...
private:
// using empty base class optimization to save space
// making unique_ptr with default_delete the same size as pointer
class _UniquePtrImpl : private deleter_type
{
public:
constexpr _UniquePtrImpl() noexcept = default;
// some other constructors...
deleter_type& _Deleter() noexcept
{ return *this; }
const deleter_type& _Deleter() const noexcept
{ return *this; }
pointer& _Ptr() noexcept
{ return _MyPtr; }
const pointer _Ptr() const noexcept
{ return _MyPtr; }
private:
pointer _MyPtr;
};
_UniquePtrImpl _MyImpl;
};
The _UniquePtrImpl
class contains the pointer and derives from the deleter_type
.
If the deleter happens to be stateless, the base class can be optimized so that it takes no bytes for itself. Then the whole unique_ptr
can be the same size as the contained pointer - that is: the same size as an ordinary pointer.
In fact there will be a size penalty for lambdas that are not stateless, i.e., lambdas that capture one or more values.
But for non-capturing lambdas, there are two key facts to notice:
Therefore, the compiler is able to invoke the lambda purely based on its type, which is recorded as part of the type of the unique_ptr
; no extra runtime information is required.
This is in fact why non-capturing lambdas are stateless. In terms of the size penalty question, there is of course nothing special about non-capturing lambdas compared to any other stateless deletion functor type.
Note that std::function
is not stateless, which is why the same reasoning does not apply to it.
Finally, note that although stateless objects are typically required to have nonzero size in order to ensure that they have unique addresses, stateless base classes are not required to add to the total size of the derived type; this is called the empty base optimization. Thus unique_ptr
can be implemented (as in Bo Perrson's answer) as a type that derives from the deleter type, which, if it's stateless, will not contribute a size penalty. (This may in fact be the only way to correctly implement unique_ptr
without a size penalty for stateless deleters, but I'm not sure.)
A unique_ptr
must always store its deleter. Now, if the deleter is a class type with no state, then the unique_ptr
can make use of empty base optimization so that the deleter does not use any additional space.
How exactly this is done differs between implementations. For instance, both libc++ and MSVC store the managed pointer and the deleter in a compressed pair, which automatically gets you empty base optimization if one of the types involved is an empty class.
From the libc++ link above
template <class _Tp, class _Dp = default_delete<_Tp> >
class _LIBCPP_TYPE_VIS_ONLY unique_ptr
{
public:
typedef _Tp element_type;
typedef _Dp deleter_type;
typedef typename __pointer_type<_Tp, deleter_type>::type pointer;
private:
__compressed_pair<pointer, deleter_type> __ptr_;
libstdc++ stores the two in an std::tuple
and some Google searching suggests their tuple
implementation employs empty base optimization but I can't find any documentation stating so explicitly.
In any case, this example demonstrates that both libc++ and libstdc++ use EBO to reduce the size of a unique_ptr
with an empty deleter.
If the deleter is stateless there's no space required to store it. If the deleter is not stateless then the state needs to be stored in the unique_ptr
itself.
std::function
and function pointers have information that is only available at runtime and so that must be stored in the object alongside the pointer the object itself. This in turn requires allocating (in the unique_ptr
itself) space to store that extra state.
Perhaps understanding the Empty Base Optimization will help you understand how this could be implemented in practice.
The std::is_empty type trait is another possibility of how this could be implemented.
How exactly library writers implement this is obviously up to them and what the standard allows.