Is make_shared really more efficient than new?

倖福魔咒の 提交于 2019-11-27 17:41:34

As infrastructure I was using llvm/clang 3.0 along with the llvm std c++ library within XCode4.

Well that appears to be your problem. The C++11 standard states the following requirements for make_shared<T> (and allocate_shared<T>), in section 20.7.2.2.6:

Requires: The expression ::new (pv) T(std::forward(args)...), where pv has type void* and points to storage suitable to hold an object of type T, shall be well formed. A shall be an allocator (17.6.3.5). The copy constructor and destructor of A shall not throw exceptions.

T is not required to be copy-constructable. Indeed, T isn't even required to be non-placement-new constructable. It is only required to be constructable in-place. This means that the only thing that make_shared<T> can do with T is new it in-place.

So the results you get are not consistent with the standard. LLVM's libc++ is broken in this regard. File a bug report.

For reference, here's what happened when I took your code into VC2010:

Create smart_ptr using make_shared...
Constructor make_shared
Create smart_ptr using make_shared: done.
Create smart_ptr using new...
Constructor new
Create smart_ptr using new: done.
Destructor
Destructor

I also ported it to Boost's original shared_ptr and make_shared, and I got the same thing as VC2010.

I'd suggest filing a bug report, as libc++'s behavior is broken.

You have to compare these two versions:

std::shared_ptr<Object> p1 = std::make_shared<Object>("foo");
std::shared_ptr<Object> p2(new Object("foo"));

In your code, the second variable is just a naked pointer, not a shared pointer at all.


Now on the meat. make_shared is (in practice) more efficient, because it allocates the reference control block together with the actual object in one single dynamic allocation. By contrast, the constructor for shared_ptr that takes a naked object pointer must allocate another dynamic variable for the reference count. The trade-off is that make_shared (or its cousin allocate_shared) does not allow you to specify a custom deleter, since the allocation is performed by the allocator.

(This does not affect the construction of the object itself. From Object's perspective there is no difference between the two versions. What's more efficient is the shared pointer itself, not the managed object.)

So one thing to keep in mind is your optimization settings. Measuring performance, particularly with regard to c++ is meaningless without optimizations enabled. I don't know if you did in fact compile with optimizations, so I thought it was worth mentioning.

That said, what you are measuring with this test is not a way that make_shared is more efficient. Simply put, you are measuring the wrong thing :-P.

Here's the deal. Normally, when you create shared pointer, it has at least 2 data members (possibly more). One for the pointer, and one for the reference count. This reference count is allocated on the heap (so that it can be shared among shared_ptr with different lifetimes...that's the point after all!)

So if you are creating an object with something like std::shared_ptr<Object> p2(new Object("foo")); There are at least 2 calls to new. One for Object and one for the reference count object.

make_shared has the option (i'm not sure it has to), to do a single new which is big enough to hold the object pointed to and the reference count in the same contiguous block. Effectively allocating an object that looks something like this (illustrative, not literally what it is).

struct T {
    int reference_count;
    Object object;
};

Since the reference count and the object's lifetimes are tied together (it doesn't make sense for one to live longer than the other). This whole block can be deleted at the same time as well.

So the efficiency is in allocations, not in copying (which I suspect had to do with optimization more than anything else).

To be clear, this is what boost has to say on about make_shared

http://www.boost.org/doc/libs/1_43_0/libs/smart_ptr/make_shared.html

Besides convenience and style, such a function is also exception safe and considerably faster because it can use a single allocation for both the object and its corresponding control block, eliminating a significant portion of shared_ptr's construction overhead. This eliminates one of the major efficiency complaints about shared_ptr.

You should not be getting any extra copies there. The output should be:

Create smart_ptr using make_shared...
Constructor make_shared
Create smart_ptr using make_shared: done.
Create smart_ptr using new...
Constructor new
Create smart_ptr using new: done.
Destructor

I don't know why you're getting extra copies. (though I see you're getting one 'Destructor' too many, so the code you used to get your output must be different from the code you posted)

make_shared is more efficient because it can be implemented using only one dynamic allocation instead of two, and because it needs one pointer's worth of memory less book-keeping per shared object.

Edit: I didn't check with Xcode 4.2 but with Xcode 4.3 I get the correct output I show above, not the incorrect output shown in the question.

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!