How to implode a vector of strings into a string (the elegant way)
Solution 1
Use boost::algorithm::join(..)
:
#include <boost/algorithm/string/join.hpp>
...
std::string joinedString = boost::algorithm::join(elems, delim);
See also this question.
Solution 2
std::vector<std::string> strings;
const char* const delim = ", ";
std::ostringstream imploded;
std::copy(strings.begin(), strings.end(),
std::ostream_iterator<std::string>(imploded, delim));
(include <string>
, <vector>
, <sstream>
and <iterator>
)
If you want to have a clean end (no trailing delimiter) have a look here
Solution 3
You should use std::ostringstream
rather than std::string
to build the output (then you can call its str()
method at the end to get a string, so your interface need not change, only the temporary s
).
From there, you could change to using std::ostream_iterator
, like so:
copy(elems.begin(), elems.end(), ostream_iterator<string>(s, delim));
But this has two problems:
-
delim
now needs to be aconst char*
, rather than a singlechar
. No big deal. -
std::ostream_iterator
writes the delimiter after every single element, including the last. So you'd either need to erase the last one at the end, or write your own version of the iterator which doesn't have this annoyance. It'd be worth doing the latter if you have a lot of code that needs things like this; otherwise the whole mess might be best avoided (i.e. useostringstream
but notostream_iterator
).
Solution 4
Because I love one-liners (they are very useful for all kinds of weird stuff, as you'll see at the end), here's a solution using std::accumulate and C++11 lambda:
std::accumulate(alist.begin(), alist.end(), std::string(),
[](const std::string& a, const std::string& b) -> std::string {
return a + (a.length() > 0 ? "," : "") + b;
} )
I find this syntax useful with stream operator, where I don't want to have all kinds of weird logic out of scope from the stream operation, just to do a simple string join. Consider for example this return statement from method that formats a string using stream operators (using std;):
return (dynamic_cast<ostringstream&>(ostringstream()
<< "List content: " << endl
<< std::accumulate(alist.begin(), alist.end(), std::string(),
[](const std::string& a, const std::string& b) -> std::string {
return a + (a.length() > 0 ? "," : "") + b;
} ) << endl
<< "Maybe some more stuff" << endl
)).str();
Update:
As pointed out by @plexando in the comments, the above code suffers from misbehavior when the array starts with empty strings due to the fact that the check for "first run" is missing previous runs that have resulted in no additional characters, and also - it is weird to run a check for "is first run" on all runs (i.e. the code is under-optimized).
The solution for both of these problems is easy if we know for a fact that the list has at least one element. OTOH, if we know for a fact that the list does not have at least one element, then we can shorten the run even more.
I think the resulting code isn't as pretty, so I'm adding it here as The Correct Solution, but I think the discussion above still has merrit:
alist.empty() ? "" : /* leave early if there are no items in the list */
std::accumulate( /* otherwise, accumulate */
++alist.begin(), alist.end(), /* the range 2nd to after-last */
*alist.begin(), /* and start accumulating with the first item */
[](auto& a, auto& b) { return a + "," + b; });
Notes:
- For containers that support direct access to the first element, its probably better to use that for the third argument instead, so
alist[0]
for vectors. - As per the discussion in the comments and chat, the lambda still does some copying. This can be minimized by using this (less pretty) lambda instead:
[](auto&& a, auto&& b) -> auto& { a += ','; a += b; return a; })
which (on GCC 10) improves performance by more than x10. Thanks to @Deduplicator for the suggestion. I'm still trying to figure out what is going on here.
Solution 5
I like to use this one-liner accumulate (no trailing delimiter):
std::accumulate(
std::next(elems.begin()),
elems.end(),
elems[0],
[](std::string a, std::string b) {
return a + delimiter + b;
}
);
ezpresso
Updated on December 27, 2021Comments
-
ezpresso over 2 years
I'm looking for the most elegant way to implode a vector of strings into a string. Below is the solution I'm using now:
static std::string& implode(const std::vector<std::string>& elems, char delim, std::string& s) { for (std::vector<std::string>::const_iterator ii = elems.begin(); ii != elems.end(); ++ii) { s += (*ii); if ( ii + 1 != elems.end() ) { s += delim; } } return s; } static std::string implode(const std::vector<std::string>& elems, char delim) { std::string s; return implode(elems, delim, s); }
Is there any others out there?
-
Michael Krelin - hacker about 13 yearskeep in mind, though, that it will add extra delimiter (the second parameter to the
std::ostream_iterator
constructor at the end of the stream. -
Jerry Coffin about 13 yearsOr use one that's already written: stackoverflow.com/questions/3496982/…
-
Oktalist over 10 yearsDon't use
accumulate
for strings. Most of the other answers are O(n) butaccumulate
is O(n^2) because it makes a temporary copy of the accumulator before appending each element. And no, move semantics don't help. -
Guss over 10 years@Oktalist, I'm not sure why you say that - cplusplus.com/reference/numeric/accumulate says "Complexity is linear in the distance between first and last".
-
Oktalist over 10 yearsThat's assuming that each individual addition takes constant time. If
T
has an overloadedoperator+
(likestring
does) or if you provide your own functor then all bets are off. Although I may have been hasty in saying move semantics don't help, they don't solve the problem in the two implementations that I've checked. See my answers to similar questions. -
Oktalist over 10 yearsskwllsp's comment is nothing to do with it. Like I said, most of the other answers (and the OP's
implode
example) are doing the right thing. They are O(n), even if they don't callreserve
on the string. Only the solution using accumulate is O(n^2). No need for C-style code. -
Guss over 10 yearsI've reviewed all the answers as well as the OPs code sample, and all of them except for the one doing reserve (with a hard-coded value which we hope is large enough) are re-copying all the so far accumulated data on every iteration (std::string normally allocates more memory than it actually needs for appending, so appending delim will likely not incur another copy). The main difference between those implementations and std::accumulate based is the additional "return a copy" from the lambda, for which move semantics will prevent a large copy operation.
-
Oktalist over 10 yearsOn every iteration? So you're saying they're all O(n^2)? Not on my machine. They don't copy on every iteration for the same reason that they don't copy when appending delim,
string
append is usually amortized O(1) because it allocates much more than it needs. Here's a quick benchmark: codepad.org/mfubiiMg -
kirbyfan64sos about 9 yearsI did a benchmark, and accumulate was actually faster than an O(n) string stream.
-
Jonny over 8 yearsThe point of "implode" is that a delimiter should not be added last. This answer unfortunately adds that delimiter last.
-
Julian over 6 yearsSuggesting to include and link against the massive boost library to create a simple string is absurd.
-
River Tam over 6 years@Julian most projects already do this. I agree that it is absurd that the STL does not include a way to do this already, however. I might also agree that this should not be the top answer, but other answers are clearly available.
-
Azeroth2b over 5 yearsI concur with @Julian. Boost may be elegant in use but is no way the "most elegant way" in terms of overhead. In this case, it's a workaround to the OP's algorithm and not a resolution to the question itself.
-
jbruni about 5 yearsMost Boost libraries are header-only, so there is nothing to link. Some even make their way into the standard.
-
Carlos Pinzón about 4 yearsBe careful when it's empty.
-
plexando over 3 yearsA questionable algorithmic design. You know upfront that a comma needs to be prepended to every element except the first one, then why do you do all those useless runtime checks? In addition to that, your code skips all empty pieces until the first non-empty one, then thereafter, empty pieces are not skipped anymore but appended.
-
Guss over 3 years@plexando - you are absolutely correct. I wasn't sure about the first part of your comment, but as I checked out the skipping empty strings issue I figured out a solution that solves both problems. The result, while more resilient, is less pretty so I'll just add it to the answer as an update. Thank you for pointing at the problem! 👍🙇♂️
-
Deduplicator over 3 yearsThe last one is bad: It doesn't take advantage of
std::move()
instd::accumulate()
, resulting in at least one allocation per element added, more likely two. The lambda should be changed to[](auto&& a, auto&& b){ a += ","; a += b; return decltype(a)(a); }
. That is, if you don't want to change it all tostd::accumulate(alist.begin(), alist.end(), std::string(), [bool f = false](auto&& a, auto&& b) mutable { if (f) a += ","; f = true; a += b; return decltype(a)(a); })
and let the compiler lower the flag to code. -
Guss over 3 years@Deduplicator Why do you think
return a + "," + b;
doesn't use move semantics (while my previous code did) and will changing it toreturn (a += ",") += b;
help? -
Deduplicator over 3 yearsIt doesn't matter whether
return a + "," + b;
uses move-semantics for the return itself, as it must perforce create a new object divorced from botha
andb
from scratch, as neither are xvalues there. Actually, the temporary is subject to RVO. Using op+= doesn't help, as that isn't overloaded for xvalue this. -
Guss over 3 years@Deduplicator, after reading en.cppreference.com/w/cpp/algorithm/accumulate, I'm still not sure I understand how you can tell when the lambda return value can be moved or not, but I'm pretty sure their examples mean that
a
is an xvalue. I did ran some benchmarks (w/ gcc 10) and while your suggestions of using forwarding references,mutable
,decltype
or the capturedbool
have no effect, using two+=
statements instead of three add terms does improve performance by ~20% (!!). Interestingly, so does just writingreturn a + ("," + b);
- so I guess we know what the problem is. -
Deduplicator over 3 yearsHave you tried making the arguments forward-references and using the first as
decltype(a)(a)
? -
Guss over 3 yearsLet us continue this discussion in chat.
-
Константин Ван over 3 yearsAnd fortunately I need to add the token last as well! Thanks for the solution.
-
Kiruahxh about 3 yearsNot having this basic feature in stdlib is absurd.
-
Jason C about 3 yearsYou'd never need to check every time though. Just add the first element outside the loop, and start the loop at the second...
-
Jason C about 3 years
-
xtofl about 3 yearsI hope the compiler is smart enough to remove the check for the
ret
to be empty on every iteration. -
xtofl about 3 yearsI don't see why you'd add that. There isn't a loop in this function, and
accumulate
does receive the first element and is told to start at the second... -
Jason C about 3 yearsWhat I mean is: "Especially with bigger collections, you want to avoid having to check if youre still adding the first element or not to ensure no trailing separator." -- You can avoid having to check this in the loop method that your statement refers to by pulling the first element out of the loop. Sorry, I was kinda vague; I was commenting on the premise, not on the solution. The solution you provided is perfectly fine.
-
xtofl about 3 yearsI share your idea. Related: stackoverflow.com/questions/156650/….
-
c z over 2 years@xtofl You're overestimating the cost of that check.
-
xtofl over 2 yearsI'm just following the "Don't pay for what you don't use" rule. You're right in most cases, where
lst
is not enormous. -
Bruce Nielsen about 2 yearsBesides the fact that
ret.empty()
is a trivial check, this is a use case that is perfect for branch predictors, since after the first test it will always evaluate to false. -
Moberg almost 2 years@kirbyfan64sos was the gist removed?
-
kirbyfan64sos almost 2 years@Moberg I changed my github username, new URL is gist.github.com/refi64/e69b605a287cedcd57f5