Is make_shared really more efficient than new?

39,718

Solution 1

As infrastructure I was using llvm/clang 3.0 along with the llvm std c++ library within XCode4.

Well that appears to be your problem. The C++11 standard states the following requirements for make_shared<T> (and allocate_shared<T>), in section 20.7.2.2.6:

Requires: The expression ::new (pv) T(std::forward<Args>(args)...), where pv has type void* and points to storage suitable to hold an object of type T, shall be well formed. A shall be an allocator (17.6.3.5). The copy constructor and destructor of A shall not throw exceptions.

T is not required to be copy-constructible. Indeed, T isn't even required to be constructible with non-placement new. It is only required to be constructible in place. This means that the only thing make_shared<T> can do with T is construct it in place with placement new.

So the results you get are not consistent with the standard. LLVM's libc++ is broken in this regard. File a bug report.
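As a minimal sketch of what that requirement means in practice: a type whose copy constructor is deleted (and which therefore has no move constructor either) can still be created with make_shared on a conforming implementation, because the object is only ever constructed in place. Pinned, make_pinned and demo_pinned are hypothetical names made up for this illustration; they are not part of the question's code.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <type_traits>
#include <utility>

// Hypothetical type for illustration: it can be neither copied nor moved.
struct Pinned {
    std::string name;

    explicit Pinned(std::string n) : name(std::move(n)) {}

    Pinned(const Pinned&) = delete;
    Pinned& operator=(const Pinned&) = delete;
};

static_assert(!std::is_copy_constructible<Pinned>::value,
              "Pinned is not copy-constructible");
static_assert(!std::is_move_constructible<Pinned>::value,
              "Pinned is not move-constructible");

// This compiles only if make_shared constructs the object in place
// from the forwarded argument, without copying or moving a Pinned.
std::shared_ptr<Pinned> make_pinned() {
    return std::make_shared<Pinned>("in place");
}

// Small self-check that the object was built from the forwarded argument.
void demo_pinned() {
    std::shared_ptr<Pinned> p = make_pinned();
    assert(p->name == "in place");
}
```

If an implementation tried to copy the object, as the libc++ version in the question apparently did, make_pinned would be expected to fail to compile rather than silently copy.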

For reference, here's what happened when I took your code into VC2010:

Create smart_ptr using make_shared...
Constructor make_shared
Create smart_ptr using make_shared: done.
Create smart_ptr using new...
Constructor new
Create smart_ptr using new: done.
Destructor
Destructor

I also ported it to Boost's original shared_ptr and make_shared, and I got the same thing as VC2010.

I'd suggest filing a bug report, as libc++'s behavior is broken.

Solution 2

You have to compare these two versions:

std::shared_ptr<Object> p1 = std::make_shared<Object>("foo");
std::shared_ptr<Object> p2(new Object("foo"));

In your code, the second variable is just a naked pointer, not a shared pointer at all.


Now on to the meat. make_shared is (in practice) more efficient, because it allocates the reference control block together with the actual object in one single dynamic allocation. By contrast, the constructor for shared_ptr that takes a naked object pointer must perform a second dynamic allocation for the control block. The trade-off is that make_shared (or its cousin allocate_shared) does not allow you to specify a custom deleter, since the allocation is performed by the allocator.

(This does not affect the construction of the object itself. From Object's perspective there is no difference between the two versions. What's more efficient is the shared pointer itself, not the managed object.)
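To illustrate the trade-off, here is a small sketch of the one thing the constructor form can do that make_shared cannot: take a custom deleter. The names Resource, g_deleted, make_with_deleter and demo_deleter are invented for this example, and the counter is only a stand-in for real cleanup logic.

```cpp
#include <cassert>
#include <memory>

// Hypothetical resource type used only for this sketch.
struct Resource {
    int id;
};

// Counts how many times the custom deleter below has run.
int g_deleted = 0;

// The shared_ptr constructor accepts a deleter as its second argument;
// std::make_shared has no parameter for one.
std::shared_ptr<Resource> make_with_deleter(int id) {
    return std::shared_ptr<Resource>(new Resource{id}, [](Resource* r) {
        ++g_deleted;  // stand-in for custom cleanup logic
        delete r;
    });
}

// Checks that the deleter runs exactly once, when the last owner goes away.
void demo_deleter() {
    const int before = g_deleted;
    {
        std::shared_ptr<Resource> p = make_with_deleter(7);
        std::shared_ptr<Resource> q = p;  // second owner, same control block
        assert(p->id == 7);
        assert(g_deleted == before);      // object still alive
    }
    assert(g_deleted == before + 1);      // both owners gone, deleter ran once
}
```

So if you need a custom deleter, you pay for the separate control block allocation; otherwise make_shared is usually the better default.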

Solution 3

So one thing to keep in mind is your optimization settings. Measuring performance, particularly in C++, is meaningless without optimizations enabled. I don't know whether you did in fact compile with optimizations, so I thought it was worth mentioning.

That said, what you are measuring with this test is not a way that make_shared is more efficient. Simply put, you are measuring the wrong thing :-P.

Here's the deal. Normally, when you create a shared pointer, it has at least 2 data members (possibly more): one for the pointer, and one for the reference count. This reference count is allocated on the heap (so that it can be shared among shared_ptr instances with different lifetimes... that's the point after all!).

So if you create an object with something like std::shared_ptr<Object> p2(new Object("foo")); there are at least 2 calls to new: one for Object and one for the reference count object.

make_shared has the option (I'm not sure it has to) of doing a single new that is big enough to hold the object pointed to and the reference count in the same contiguous block, effectively allocating an object that looks something like this (illustrative, not literally what it is):

struct T {
    int reference_count;
    Object object;
};

Since the lifetimes of the reference count and the object are tied together (it doesn't make sense for one to live longer than the other), this whole block can be deleted at the same time as well.

So the efficiency is in allocations, not in copying (which I suspect had to do with optimization more than anything else).
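If you want to measure the right thing, one rough way is to count calls to the global operator new around each form of construction. The sketch below does that by replacing the global allocation functions. Payload, g_allocations, g_keep and the two counting functions are hypothetical names for this example. On the mainstream standard libraries I'm aware of, the constructor form typically reports 2 allocations and make_shared reports 1, but the single allocation is only recommended by the standard, so treat the exact numbers as implementation-dependent.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <memory>
#include <new>

// Number of calls to the global operator new since the last reset.
static std::size_t g_allocations = 0;

// Replace the global allocation functions so every allocation is counted.
void* operator new(std::size_t size) {
    ++g_allocations;
    if (void* p = std::malloc(size ? size : 1)) {
        return p;
    }
    throw std::bad_alloc();
}
void operator delete(void* p) noexcept { std::free(p); }
void operator delete(void* p, std::size_t) noexcept { std::free(p); }

// Hypothetical payload type for this sketch.
struct Payload {
    int a;
    double b;
    Payload(int a_, double b_) : a(a_), b(b_) {}
};

// Holding the result in a global keeps the allocations observable.
static std::shared_ptr<Payload> g_keep;

// Allocations performed by shared_ptr<T>(new T(...)).
std::size_t allocations_with_new() {
    g_keep.reset();
    g_allocations = 0;
    g_keep = std::shared_ptr<Payload>(new Payload(1, 2.0));
    return g_allocations;
}

// Allocations performed by make_shared<T>(...).
std::size_t allocations_with_make_shared() {
    g_keep.reset();
    g_allocations = 0;
    g_keep = std::make_shared<Payload>(1, 2.0);
    return g_allocations;
}

// Self-check: make_shared should need fewer allocations.
void demo_allocations() {
    assert(allocations_with_make_shared() < allocations_with_new());
}
```

Note that nothing here counts constructor calls: in both cases Payload is constructed exactly once, and only the number of heap allocations differs.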

To be clear, this is what Boost has to say about make_shared:

http://www.boost.org/doc/libs/1_43_0/libs/smart_ptr/make_shared.html

Besides convenience and style, such a function is also exception safe and considerably faster because it can use a single allocation for both the object and its corresponding control block, eliminating a significant portion of shared_ptr's construction overhead. This eliminates one of the major efficiency complaints about shared_ptr.

Solution 4

You should not be getting any extra copies there. The output should be:

Create smart_ptr using make_shared...
Constructor make_shared
Create smart_ptr using make_shared: done.
Create smart_ptr using new...
Constructor new
Create smart_ptr using new: done.
Destructor

I don't know why you're getting extra copies (though I see you're getting one 'Destructor' too many, so the code you used to get your output must be different from the code you posted).

make_shared is more efficient because it can be implemented using only one dynamic allocation instead of two, and because it needs one pointer's worth less bookkeeping memory per shared object.

Edit: I didn't check with Xcode 4.2 but with Xcode 4.3 I get the correct output I show above, not the incorrect output shown in the question.
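If you want to check your own toolchain without reading console output, a rough sketch is to count copy constructions instead of printing them. Counted, copies_made_by_make_shared and demo_no_copies are hypothetical names for this example, with Counted standing in for the question's Object. On a conforming implementation the count should be zero.

```cpp
#include <cassert>
#include <memory>
#include <string>

// Hypothetical stand-in for the question's Object that counts
// copy constructions instead of printing them.
struct Counted {
    static int copied;

    explicit Counted(const std::string&) {}
    Counted(const Counted&) { ++copied; }
};
int Counted::copied = 0;

// Returns the number of copy constructions caused by one make_shared call.
int copies_made_by_make_shared() {
    const int before = Counted::copied;
    std::shared_ptr<Counted> p = std::make_shared<Counted>("make_shared");
    (void)p;  // only the side effect on the counter matters here
    return Counted::copied - before;
}

// Self-check: a conforming make_shared constructs in place, with no copies.
void demo_no_copies() {
    assert(copies_made_by_make_shared() == 0);
}
```

A non-zero result would point to the same kind of library bug as the one described in the question.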

Admin
Author by

Admin

Updated on July 09, 2022

Comments

  • Admin
    Admin almost 2 years

    I was experimenting with shared_ptr and make_shared from C++11 and programmed a little toy example to see what is actually happening when calling make_shared. As infrastructure I was using llvm/clang 3.0 along with the llvm std c++ library within XCode4.

    #include <iostream>
    #include <memory>
    #include <string>

    using namespace std;

    class Object
    {
    public:
        Object(const string& str)
        {
            cout << "Constructor " << str << endl;
        }
    
        Object()
        {
            cout << "Default constructor" << endl;
    
        }
    
        ~Object()
        {
            cout << "Destructor" << endl;
        }
    
        Object(const Object& rhs)
        {
            cout << "Copy constructor..." << endl;
        }
    };
    
    void make_shared_example()
    {
        cout << "Create smart_ptr using make_shared..." << endl;
        auto ptr_res1 = make_shared<Object>("make_shared");
        cout << "Create smart_ptr using make_shared: done." << endl;
    
        cout << "Create smart_ptr using new..." << endl;
        shared_ptr<Object> ptr_res2(new Object("new"));
        cout << "Create smart_ptr using new: done." << endl;
    }
    

    Now have a look at the output, please:

    Create smart_ptr using make_shared...
    Constructor make_shared
    Copy constructor...
    Copy constructor...
    Destructor
    Destructor
    Create smart_ptr using make_shared: done.
    Create smart_ptr using new...
    Constructor new
    Create smart_ptr using new: done.
    Destructor
    Destructor

    It appears that make_shared is calling the copy constructor two times. If I allocate memory for an Object using a regular new this does not happen, only one Object is constructed.

    What I am wondering about is the following. I heard that make_shared is supposed to be more efficient than using new(1, 2). One reason is because make_shared allocates the reference count together with the object to be managed in the same block of memory. OK, I got the point. This is of course more efficient than two separate allocation operations.

    On the other hand, I don't understand why this has to come at the cost of two calls to the copy constructor of Object. Because of this I am not convinced that make_shared is more efficient than allocation using new in every case. Am I wrong here? Well OK, one could implement a move constructor for Object, but I am still not sure whether this is more efficient than just allocating Object through new. At least not in every case. It would be true if copying Object is less expensive than allocating memory for a reference counter. But the shared_ptr-internal reference counter could be implemented using a couple of primitive data types, right?

    Can you help and explain why make_shared is the way to go in terms of efficiency, despite the outlined copy overhead?

  • Andrew Durward
    Andrew Durward about 12 years
    "The results you get are entirely consistent with what the C++ standard allows." I don't see anything in the code that should cause the instance of Object to be copy/move constructed (regardless of whether or not the compiler opts to elide such a construction.)
  • Nicol Bolas
    Nicol Bolas about 12 years
    @AndrewDurward: Actually, you're right and wrong. The standard's requirements on make_shared<T> do not state that T must be copy-constructible. Therefore, make_shared<T> cannot call the copy constructor. You're wrong in that, if the standard did allow T to be copy-constructible, an implementation of make_shared<T> could call it.
  • Howard Hinnant
    Howard Hinnant about 12 years
    @NicolBolas: Thanks for the bug report against libc++. I agree with your analysis. This has been fixed in the libc++ public svn trunk and the copy constructor is no longer called.
  • Howard Hinnant
    Howard Hinnant about 12 years
    Talk about good timing! ;-) Thanks for the Xcode 4.3 report.
  • Ela782
    Ela782 almost 11 years
    "make_shared is (in practice) more efficient, because it allocates the reference control block together with the actual object in one single dynamic allocation". I think that is only the case for VS2012, they do this optimization, but linux std-libs don't do that optimization (yet?).
  • Kerrek SB
    Kerrek SB almost 11 years
    @Ela782: Yes, GCC has done so for some time. This is explicitly recommended by the standard in 20.7.2.2.6/6.
  • dashesy
    dashesy over 8 years
    I have copy constructor implicitly deleted because I defined a user-declared move constructor. Now clang complains about make_shared call to implicitly-deleted copy constructor. So if make_shared does not need the copy constructor, is this a bug?
  • dashesy
    dashesy over 8 years
    I was passing a temporary to make_shared, like std::make_shared(M(..)). I changed it to std::make_shared(std::move(M(..))) and it is good now.
  • yonil
    yonil over 8 years
    A minor nit, the Object instance can die before the refc control block i.e. if weak_ptrs are used. This isn't a problem, except the minor issue that memory directly held by Object's layout is not reclaimed until the control block dies too; in an ordinary shared_ptr the object's heap block can be reclaimed as soon as it expires.
  • choxsword
    choxsword about 6 years
    So this efficiency only happens in the initialization of shared_ptr? Is there any efficiency gain in later usage of a ptr created by make_shared?
  • Kerrek SB
    Kerrek SB about 6 years
    @bigxiao: There shouldn't be. A good implementation will store the actual pointer to the object at the beginning of the shared_ptr regardless of how the shared_ptr was created, so that dereferencing never requires more computation than it would for a raw pointer. The final deallocation differs, of course (especially in the presence of weak pointers).
  • choxsword
    choxsword about 6 years
    @KerrekSB make_shared stores the object and the ref counts together (in the same block), so maybe there are fewer cache misses.
  • Kerrek SB
    Kerrek SB about 6 years
    @bigxiao: Sure, but the most important operation of a smart pointer, i.e. dereferencing, doesn't require access of the control block.
  • Andrew
    Andrew about 3 years
    "...the second variable is just a naked pointer, not a shared pointer at all." Wait what? Really? How can that be?