merge vector into existing vector
Solution 1
I would at least try:
std::vector<T> tmp;
tmp.reserve(src.size() + dst.size()); // commenters are probably right about this
std::merge(src.begin(), src.end(), dst.begin(), dst.end(), std::back_inserter(tmp));
src.swap(tmp);
But I suspect that much depends on the nature of T
, the size of src
and dst
, and why we need to optimize.
Solution 2
It can be done more efficiently if T is heavy to copy and your compiler supports C++0x.
#include <iterator> // for make_move_iterator
size_t n = dst.size();
dst.insert(dst.end(),
std::make_move_iterator(src.begin()),
std::make_move_iterator(src.end()));
std::inplace_merge(dst.begin(), dst.begin() + n, dst.end());
Using make_move_iterator()
will cause insert()
to move the contents of src
into dst
instead of copying them.
Update:
You're dealing with POD types and you're already resizing/copying everything in the dst
vector in the likely case that insert()
overflows the reserve, so it could be faster to just use std::merge()
into a new vector
. This would avoid that initial copy AND have a better worst-case complexity:
inplace_merge()
has a best case of O(n) complexity, but degrades into a worst-case O(n log n) depending on your data.
merge()
has a worst-case O(n) so it is guaranteed to be at least as fast, potentially much faster. It also has move optimization built-in.
zwol
If you want to know about me, please see my website: https://www.owlfolio.org/ . DO NOT CONTACT ME WITH ANY SORT OF JOB OFFER. The policy of treating comment threads as ephemeral and "not for extended discussion" is wrong, is actively harmful to the community, and I will not cooperate with it in any way. Using people's preferred pronouns is basic courtesy and it is reasonable for the code of conduct to insist on it. (I go by "he", for the record.)
Updated on July 18, 2022Comments
-
zwol almost 2 years
In C++, given
vector<T> src, dst
, both already sorted, is there a more efficient way to merge the contents ofsrc
intodst
thansize_t n = dst.size(); dst.insert(dst.end(), src.begin(), src.end()); std::inplace_merge(dst.begin(), dst.begin() + n, dst.end());
? In the case I care about,
T
is a small (12-16 bytes, depending on ABI) POD structure, but each vector contains millions of elements, so the total amount of memory in play is tens to hundreds of megabytes. -
Oliver Charlesworth almost 13 yearsTo avoid growing
tmp
with aback_inserter
, why not construct it asstd::vector<T> tmp(src.size() + dst.size())
? Or usereserve
? -
pmr almost 13 years@Oli, I'd rather use
reserve
in case default construction is expensive. -
Kerrek SB almost 13 yearsThat would construct lots of elements -- why not just
reserve
the vector before the merging? -
Karl Knechtel almost 13 yearsSure,
.reserve()
is a good idea here... edited accordingly. -
Kerrek SB almost 13 yearsIt should be
dst.end()
in the final line, and you can probably maken
const. -
Steve Jessop almost 13 yearsAnd if T is heavy to copy but your compiler doesn't support move semantics, you could resize dst and then swap each element of src into place. Obviously T needs an optimized swap. And I've never used
inplace_merge
, but it looks as though it does a lot of assignment and/or copying, which might kill performance anyway in the absence of move. -
zwol almost 13 yearsIn the case I care about, T is a small (12 bytes, possibly padded to 16 in a
vector
) POD structure, so it seems to me that copy and move will both degenerate tomemcpy
. -
Steve Jessop almost 13 yearsAnd if
T
is a nasty type, expensive to copy, then you could unpackmerge
and change it to default-construct and then swap from the orginal, rather than copying. C++0x has apush_back(T &&)
, but that won't be used without help since the thing being pushed isn't necessarily disposable, so combine this answer with Cory's. -
zwol almost 13 yearsI might try this, but the vectors in question are potentially very large (each element is small, but there could be millions) so I expect that would wind up hurting more than it helps.
-
Cory Nelson almost 13 yearsYes. In that case you can still try the
merge
suggestion in the other answer. He doesn't say it, butmerge
can actually be more efficient depending on your data:inplace_merge
is O(n) in the best case, and O(n log n) in the worst.merge
is always O(n). -
zwol almost 13 yearsThe vectors potentially have millions of elements, I don't think copying everything to a new vector is a good idea.
-
Steve Jessop almost 13 years@Zack: you're going to copy (nearly) everything anyway, unless you're really lucky and (nearly) all of
src
is greater than (nearly) all ofdst
. The difference is thatinplace_merge
(usually) copies everything more than once, and (if necessary) uses less RAM as working space. -
zwol over 12 yearsMonths later: this was, in fact, an improvement over what I had, although not as much as I was hoping. An improvement of the magnitude I wanted probably requires a smarter algorithm, and I'm no longer working on that particular problem, so this'll do for the accepted answer.
-
Admin almost 11 years@KarlKnechtel Why it give me error on
std::back_inserter(tmp));
as i am using it with string ,std::vector<string> tmp;
,Error : std ahs member back_inserter
-
Karl Knechtel almost 11 yearsIt comes from
<iterator>
.