Is make_shared really more efficient than new?
Solution 1
As infrastructure I was using llvm/clang 3.0 along with the llvm std c++ library within XCode4.
Well that appears to be your problem. The C++11 standard states the following requirements for make_shared<T> (and allocate_shared<T>), in section 20.7.2.2.6:
Requires: The expression ::new (pv) T(std::forward<Args>(args)...), where pv has type void* and points to storage suitable to hold an object of type T, shall be well formed. A shall be an allocator (17.6.3.5). The copy constructor and destructor of A shall not throw exceptions.
T is not required to be copy-constructible. Indeed, T isn't even required to be non-placement-new constructible. It is only required to be constructible in-place. This means that the only thing that make_shared<T> can do with T is new it in-place.
So the results you get are not consistent with the standard. LLVM's libc++ is broken in this regard. File a bug report.
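To make the in-place requirement concrete, here is a minimal sketch. The type Pinned and the helper make_pinned are hypothetical names invented for illustration; they do not come from the question. Pinned deletes both its copy and move operations, so the call can only compile if make_shared constructs the object directly from the forwarded arguments.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>

// Hypothetical type for illustration: neither copyable nor movable.
class Pinned {
public:
    explicit Pinned(std::string name) : name_(std::move(name)) {}

    Pinned(const Pinned&) = delete;
    Pinned& operator=(const Pinned&) = delete;
    Pinned(Pinned&&) = delete;
    Pinned& operator=(Pinned&&) = delete;

    const std::string& name() const { return name_; }

private:
    std::string name_;
};

// Hypothetical helper. The argument is forwarded to Pinned's constructor,
// which runs via placement new inside storage owned by the shared_ptr.
// No copy or move of Pinned is needed anywhere.
std::shared_ptr<Pinned> make_pinned() {
    return std::make_shared<Pinned>("make_shared");
}
```

On a conforming standard library this compiles and make_pinned()->name() yields "make_shared". An implementation that copied the object internally would reject it at compile time.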
For reference, here's what happened when I took your code into VC2010:
Create smart_ptr using make_shared...
Constructor make_shared
Create smart_ptr using make_shared: done.
Create smart_ptr using new...
Constructor new
Create smart_ptr using new: done.
Destructor
Destructor
I also ported it to Boost's original shared_ptr and make_shared, and I got the same thing as VC2010.
I'd suggest filing a bug report, as libc++'s behavior is broken.
Solution 2
You have to compare these two versions:
std::shared_ptr<Object> p1 = std::make_shared<Object>("foo");
std::shared_ptr<Object> p2(new Object("foo"));
In your code, the second variable is just a naked pointer, not a shared pointer at all.
Now to the meat. make_shared is (in practice) more efficient, because it allocates the reference control block together with the actual object in one single dynamic allocation. By contrast, the constructor for shared_ptr that takes a naked object pointer must allocate another dynamic variable for the reference count. The trade-off is that make_shared (or its cousin allocate_shared) does not allow you to specify a custom deleter, since the allocation is performed by the allocator.
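As a sketch of that trade-off: when a custom deleter is needed, the shared_ptr constructor is the route to take, because make_shared has no deleter parameter. Resource, release, acquire and g_released below are hypothetical names used only for this illustration.

```cpp
#include <cassert>
#include <memory>

// Hypothetical resource type for illustration.
struct Resource {
    int id;
};

// Counts how often the custom deleter has run (illustration only).
static int g_released = 0;

// Custom deleter: records the release, then frees the object.
void release(Resource* r) {
    ++g_released;
    delete r;
}

// The deleter is passed as the second constructor argument.
// std::make_shared offers no equivalent parameter.
std::shared_ptr<Resource> acquire(int id) {
    return std::shared_ptr<Resource>(new Resource{id}, release);
}
```

The price is the separate control-block allocation described above: the object comes from the new expression, and the control block that stores the deleter is allocated by the shared_ptr constructor.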
(This does not affect the construction of the object itself. From Object's perspective there is no difference between the two versions. What's more efficient is the shared pointer itself, not the managed object.)
Solution 3
So one thing to keep in mind is your optimization settings. Measuring performance, particularly with regard to C++, is meaningless without optimizations enabled. I don't know if you did in fact compile with optimizations, so I thought it was worth mentioning.
That said, what you are measuring with this test is not the way in which make_shared is more efficient. Simply put, you are measuring the wrong thing :-P.
Here's the deal. Normally, when you create a shared pointer, it has at least 2 data members (possibly more): one for the pointer, and one for the reference count. This reference count is allocated on the heap (so that it can be shared among shared_ptr instances with different lifetimes...that's the point after all!)
So if you are creating an object with something like
std::shared_ptr<Object> p2(new Object("foo"));
there are at least 2 calls to new: one for Object and one for the reference count object.
make_shared has the option (I'm not sure it has to) to do a single new which is big enough to hold the object pointed to and the reference count in the same contiguous block, effectively allocating an object that looks something like this (illustrative, not literally what it is):
struct T {
    int reference_count;
    Object object;
};
Since the lifetimes of the reference count and the object are tied together (it doesn't make sense for one to live longer than the other), this whole block can be deleted at the same time as well.
So the efficiency is in allocations, not in copying (which I suspect had to do with optimization more than anything else).
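One way to observe this is to count calls to the global operator new. The sketch below replaces the global allocation functions for that purpose. The counter g_allocations, the int-based Object stand-in (chosen so that no std::string allocation muddies the count) and the two helper functions are assumptions made for illustration. The standard only recommends the single allocation, so the make_shared count is implementation-dependent.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <memory>
#include <new>

// Hypothetical counter, bumped by the replaced global operator new below.
static std::size_t g_allocations = 0;

void* operator new(std::size_t size) {
    ++g_allocations;
    if (void* p = std::malloc(size ? size : 1)) {
        return p;
    }
    throw std::bad_alloc();
}

void operator delete(void* p) noexcept { std::free(p); }
void operator delete(void* p, std::size_t) noexcept { std::free(p); }

// Stand-in for the question's Object, without a string member.
struct Object {
    explicit Object(int v) : value(v) {}
    int value;
};

// Allocations performed by shared_ptr<Object>(new Object(...)):
// one for the object, one for the control block.
std::size_t allocations_with_new() {
    const std::size_t before = g_allocations;
    std::shared_ptr<Object> p(new Object(1));
    return g_allocations - before;
}

// Allocations performed by make_shared<Object>(...): typically a single
// block that holds both the control block and the object.
std::size_t allocations_with_make_shared() {
    const std::size_t before = g_allocations;
    std::shared_ptr<Object> p = std::make_shared<Object>(1);
    return g_allocations - before;
}
```

Calling both functions and comparing the results shows the difference directly. Run it in a build without aggressive optimization, since compilers are permitted to elide some allocations.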
To be clear, this is what Boost has to say about make_shared:
http://www.boost.org/doc/libs/1_43_0/libs/smart_ptr/make_shared.html
Besides convenience and style, such a function is also exception safe and considerably faster because it can use a single allocation for both the object and its corresponding control block, eliminating a significant portion of shared_ptr's construction overhead. This eliminates one of the major efficiency complaints about shared_ptr.
Solution 4
You should not be getting any extra copies there. The output should be:
Create smart_ptr using make_shared...
Constructor make_shared
Create smart_ptr using make_shared: done.
Create smart_ptr using new...
Constructor new
Create smart_ptr using new: done.
Destructor
I don't know why you're getting extra copies. (though I see you're getting one 'Destructor' too many, so the code you used to get your output must be different from the code you posted)
make_shared is more efficient because it can be implemented using only one dynamic allocation instead of two, and because it needs one pointer's worth less memory for book-keeping per shared object.
Edit: I didn't check with Xcode 4.2 but with Xcode 4.3 I get the correct output I show above, not the incorrect output shown in the question.
Admin
Updated on July 09, 2022
Comments
-
Admin almost 2 years
I was experimenting with shared_ptr and make_shared from C++11 and programmed a little toy example to see what is actually happening when calling make_shared. As infrastructure I was using llvm/clang 3.0 along with the llvm std c++ library within XCode4.
class Object
{
public:
    Object(const string& str)
    {
        cout << "Constructor " << str << endl;
    }

    Object()
    {
        cout << "Default constructor" << endl;
    }

    ~Object()
    {
        cout << "Destructor" << endl;
    }

    Object(const Object& rhs)
    {
        cout << "Copy constructor..." << endl;
    }
};

void make_shared_example()
{
    cout << "Create smart_ptr using make_shared..." << endl;
    auto ptr_res1 = make_shared<Object>("make_shared");
    cout << "Create smart_ptr using make_shared: done." << endl;

    cout << "Create smart_ptr using new..." << endl;
    shared_ptr<Object> ptr_res2(new Object("new"));
    cout << "Create smart_ptr using new: done." << endl;
}
Now have a look at the output, please:
Create smart_ptr using make_shared...
Constructor make_shared
Copy constructor...
Copy constructor...
Destructor
Destructor
Create smart_ptr using make_shared: done.
Create smart_ptr using new...
Constructor new
Create smart_ptr using new: done.
Destructor
Destructor
It appears that make_shared is calling the copy constructor two times. If I allocate memory for an Object using a regular new this does not happen, only one Object is constructed.
What I am wondering about is the following. I heard that make_shared is supposed to be more efficient than using new (1, 2). One reason is because make_shared allocates the reference count together with the object to be managed in the same block of memory. OK, I got the point. This is of course more efficient than two separate allocation operations.
On the contrary I don't understand why this has to come with the cost of two calls to the copy constructor of Object. Because of this I am not convinced that make_shared is more efficient than allocation using new in every case. Am I wrong here? Well OK, one could implement a move constructor for Object but still I am not sure whether this is more efficient than just allocating Object through new. At least not in every case. It would be true if copying Object is less expensive than allocating memory for a reference counter. But the shared_ptr-internal reference counter could be implemented using a couple of primitive data types, right?
Can you help and explain why make_shared is the way to go in terms of efficiency, despite the outlined copy overhead?
-
Andrew Durward about 12 years
"The results you get are entirely consistent with what the C++ standard allows." I don't see anything in the code that should cause the instance of Object to be copy/move constructed (regardless of whether or not the compiler opts to elide such a construction.)
-
Nicol Bolas about 12 years
@AndrewDurward: Actually, you're right and wrong. The standard's requirements on make_shared<T> do not state that T must be copy constructable. Therefore, make_shared<T> cannot call the copy constructor. You're wrong in that if the standard did allow T to be copy constructable that an implementation of make_shared<T> could call it.
-
Howard Hinnant about 12 years
@NicolBolas: Thanks for the bug report against libc++. I agree with your analysis. This has been fixed in the libc++ public svn trunk and the copy constructor is no longer called.
-
Howard Hinnant about 12 years
Talk about good timing! ;-) Thanks for the Xcode 4.3 report.
-
Ela782 almost 11 years
"make_shared is (in practice) more efficient, because it allocates the reference control block together with the actual object in one single dynamic allocation". I think that is only the case for VS2012, they do this optimization, but linux std-libs don't do that optimization (yet?).
-
Kerrek SB almost 11 years
@Ela782: Yes, GCC has done so for some time. This is explicitly recommended by the standard in 20.7.2.2.6/6.
-
dashesy over 8 years
I have the copy constructor implicitly deleted because I defined a user-declared move constructor. Now clang complains about a make_shared call to the implicitly-deleted copy constructor. So if make_shared does not need the copy constructor, is this a bug?
-
dashesy over 8 years
I was passing a temporary to make_shared like std::make_shared(M(..)), changed it to std::make_shared(std::move(M(..))) and it is good now.
-
yonil over 8 years
A minor nit, the Object instance can die before the refc control block i.e. if weak_ptrs are used. This isn't a problem, except the minor issue that memory directly held by Object's layout is not reclaimed until the control block dies too; in an ordinary shared_ptr the object's heap block can be reclaimed as soon as it expires.
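The effect described in this comment can be sketched with allocate_shared and a byte-counting allocator. CountingAlloc, Payload, Observation, observe and g_live_bytes are all hypothetical names introduced for this illustration. The sketch assumes an implementation that puts the object and control block in one allocation, which the standard recommends but does not require.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <new>

// Hypothetical bookkeeping: bytes currently handed out by CountingAlloc.
static std::size_t g_live_bytes = 0;

// Minimal C++11 allocator that tracks live bytes (illustration only).
template <class T>
struct CountingAlloc {
    using value_type = T;

    CountingAlloc() = default;
    template <class U>
    CountingAlloc(const CountingAlloc<U>&) noexcept {}

    T* allocate(std::size_t n) {
        g_live_bytes += n * sizeof(T);
        return static_cast<T*>(::operator new(n * sizeof(T)));
    }
    void deallocate(T* p, std::size_t n) noexcept {
        g_live_bytes -= n * sizeof(T);
        ::operator delete(p);
    }
};

template <class T, class U>
bool operator==(const CountingAlloc<T>&, const CountingAlloc<U>&) noexcept { return true; }
template <class T, class U>
bool operator!=(const CountingAlloc<T>&, const CountingAlloc<U>&) noexcept { return false; }

// Deliberately large object so the retained storage is easy to spot.
struct Payload {
    static int alive;
    char bytes[4096];
    Payload() : bytes() { ++alive; }
    ~Payload() { --alive; }
};
int Payload::alive = 0;

struct Observation {
    bool destroyed_with_weak_alive;      // destructor ran while a weak_ptr existed
    std::size_t bytes_held_by_weak;      // storage still allocated at that point
    std::size_t bytes_after_weak_reset;  // storage left once the weak_ptr is gone
};

Observation observe() {
    Observation result{};
    std::weak_ptr<Payload> weak;
    {
        std::shared_ptr<Payload> strong =
            std::allocate_shared<Payload>(CountingAlloc<Payload>());
        weak = strong;
    }  // last strong owner gone: ~Payload runs here
    result.destroyed_with_weak_alive = (Payload::alive == 0) && weak.expired();
    result.bytes_held_by_weak = g_live_bytes;
    weak.reset();  // control block, and any storage combined with it, is released
    result.bytes_after_weak_reset = g_live_bytes;
    return result;
}
```

With a combined allocation, bytes_held_by_weak stays at least sizeof(Payload) even though the destructor has already run, and it only drops to zero once the last weak_ptr is gone.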
-
choxsword about 6 years
So this efficiency only happens in the initialization of the shared_ptr? Is there any efficiency gain in later usage of a pointer created by make_shared?
-
Kerrek SB about 6 years
@bigxiao: There shouldn't be. A good implementation will store the actual pointer to the object at the beginning of the shared_ptr regardless of how the shared_ptr was created, so that dereferencing never requires more computation than it would for a raw pointer. The final deallocation differs, of course (especially in the presence of weak pointers).
-
choxsword about 6 years
@KerrekSB make_shared stores the object and the reference counts together (in the same control block), so maybe there are fewer cache misses.
-
Kerrek SB about 6 years
@bigxiao: Sure, but the most important operation of a smart pointer, i.e. dereferencing, doesn't require access of the control block.
-
Andrew about 3 years
"...the second variable is just a naked pointer, not a shared pointer at all." Wait what? Really? How can that be?