C++ - Split string by regex
Solution 1
You don't need to use regular expressions if you just want to split a string by multiple spaces. Writing your own regex library is overkill for something that simple.
The answer you linked to in your comments, Split a string in C++?, can easily be changed so that it doesn't include any empty elements if there are multiple spaces.
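#include <sstream>
#include <string>
#include <vector>

// Appends the non-empty pieces of s, split on delim, to elems.
std::vector<std::string> &split(const std::string &s, char delim, std::vector<std::string> &elems) {
    std::stringstream ss(s);
    std::string item;
    while (std::getline(ss, item, delim)) {
        if (item.length() > 0) {  // skip empty items produced by consecutive delimiters
            elems.push_back(item);
        }
    }
    return elems;
}

// Convenience overload that returns a new vector.
std::vector<std::string> split(const std::string &s, char delim) {
    std::vector<std::string> elems;
    split(s, delim, elems);
    return elems;
}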
By checking that item.length() > 0 before pushing item onto the elems vector, you will no longer get extra elements if your input contains multiple delimiters (spaces in your case).
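For the specific case asked about (splitting on runs of whitespace, roughly what the pattern "\\s+" expresses, with a pre-C++11 compiler), stream extraction may already be enough, because operator>> skips any run of whitespace by default. The following is only a minimal sketch under that assumption; the name split_whitespace is hypothetical and does not come from the linked answer.

```cpp
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper (name chosen for illustration): splits on runs of
// whitespace using only C++03 features, no regex library required.
std::vector<std::string> split_whitespace(const std::string &s) {
    std::vector<std::string> elems;
    std::istringstream ss(s);
    std::string item;
    // operator>> skips leading whitespace and stops at the next whitespace,
    // so consecutive spaces, tabs, and newlines never produce empty items.
    while (ss >> item) {
        elems.push_back(item);
    }
    return elems;
}
```

Unlike the getline-based split above, this treats tabs and newlines as separators too, which is closer to "\\s+" but may not be what you want if only spaces should split.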
Solution 2
#include <iostream>
#include <regex>
#include <string>

// string_to_split is assumed to be a std::string defined elsewhere.
std::regex rgx("\\s+");
std::sregex_token_iterator iter(string_to_split.begin(),
                                string_to_split.end(),
                                rgx,
                                -1);
std::sregex_token_iterator end;
for ( ; iter != end; ++iter)
    std::cout << *iter << '\n';
The -1 is the key here: when the iterator is constructed, it points at the text that precedes the first match, and after each increment it points at the text that follows the previous match.
If you don't have C++11, the same thing should work with TR1 or (possibly with slight modification) with Boost.
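To make the role of the last constructor argument more concrete, here is a small sketch that collects whatever the iterator yields for a given submatch index. The helper name collect is hypothetical and only for illustration; -1 selects the text between matches, while 0 selects the matches themselves.

```cpp
#include <regex>
#include <string>
#include <vector>

// Hypothetical helper: gathers everything the token iterator yields.
//   submatch == -1 -> the text between matches (the tokens)
//   submatch ==  0 -> the matched text itself (the separators)
std::vector<std::string> collect(const std::string &text,
                                 const std::regex &rgx,
                                 int submatch) {
    std::sregex_token_iterator iter(text.begin(), text.end(), rgx, submatch);
    std::sregex_token_iterator end;
    // Each sub_match converts to std::string when the vector is filled.
    return std::vector<std::string>(iter, end);
}
```

With the pattern ",\\s*" and the input "a, b,c", index -1 gives the three tokens a, b, and c, while index 0 gives the two separators.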
Solution 3
To expand on the answer by @Pete Becker, here is an example of a resplit function that can be used to split text using a regexp:
#include <regex>
#include <string>
#include <vector>

std::vector<std::string> resplit(const std::string &s, const std::regex &sep_regex = std::regex{"\\s+"}) {
    std::sregex_token_iterator iter(s.begin(), s.end(), sep_regex, -1);
    std::sregex_token_iterator end;
    return {iter, end};
}
This works as follows:
// Assumes #include <iostream> and using namespace std;
string s1 = "first second third ";
vector<string> v22 = resplit(s1);
for (const auto & e: v22) {
    cout << "Token:" << e << endl;
}
//Token:first
//Token:second
//Token:third
string s222 = "first|second:third,forth";
// std::regex's constructor from a string is explicit, so wrap the pattern.
vector<string> v222 = resplit(s222, std::regex("[|:,]"));
for (const auto & e: v222) {
    cout << "Token:" << e << endl;
}
//Token:first
//Token:second
//Token:third
//Token:forth
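One edge case worth knowing about: when the input starts with a separator, the token iterator yields an empty first token, and a single-character separator class such as "[|:,]" yields empty tokens between adjacent separators. If that is not wanted, a filtering variant can be sketched as follows. The name resplit_nonempty is hypothetical, not part of the answer above.

```cpp
#include <regex>
#include <string>
#include <vector>

// Hypothetical variant of resplit that drops empty tokens, for example the
// one produced when the input begins with a separator.
std::vector<std::string> resplit_nonempty(const std::string &s,
                                          const std::regex &sep_regex = std::regex{"\\s+"}) {
    std::vector<std::string> tokens;
    std::sregex_token_iterator iter(s.begin(), s.end(), sep_regex, -1);
    std::sregex_token_iterator end;
    for (; iter != end; ++iter) {
        if (iter->length() > 0) {  // keep only non-empty pieces
            tokens.push_back(iter->str());
        }
    }
    return tokens;
}
```

This mirrors the item.length() > 0 check from Solution 1, applied to the regex-based approach.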
Solution 4
string s = "foo bar baz";
regex e("\\s+");
regex_token_iterator<string::iterator> i(s.begin(), s.end(), e, -1);
regex_token_iterator<string::iterator> end;
while (i != end)
    cout << " [" << *i++ << "]";
prints [foo] [bar] [baz]
nothing-special-here
Maciej Kowalski Freelance - Ruby / JRuby / Rails / Backbone / AngularJS / Ember.js
Updated on July 09, 2022
Comments
-
nothing-special-here almost 2 years
I want to split std::string by regex.
I have found some solutions on Stackoverflow, but most of them split the string by a single space or use external libraries like boost. I can't use boost.
I want to split the string by the regex "\\s+".
I am using this g++ version: g++ (Debian 4.4.5-8) 4.4.5, and I can't upgrade.
-
nothing-special-here almost 11 years
Right now I am using this function to split the string: stackoverflow.com/a/236803/418518. It works only with a single char. The regex format is correct; I have already used it in a Java project. It works brilliantly.
-
nothing-special-here almost 11 years
The problem is that I don't know C++ much... and I just want to know how to split std::string using an old C++ standard (C++03 probably). If you have some links / code just paste it. :) Thanks!
-
melwil almost 11 years
Can you show example input and desired output?
-
Bernhard Barker almost 11 years
Using boost may be an option.
-
nothing-special-here almost 11 years
@melwil: Desired input / output: gist.github.com/maciejkowalski/af7e0ce2b92d967e050c
-
nothing-special-here almost 11 years
@Dukeling: Unfortunately, I can't use boost. ;/
-
Bernhard Barker almost 11 years
If that version of g++ is C++11 compliant, this / this may be a starting point. Otherwise, splitting by regex pattern without an external library will probably require writing a regex parser (which is no small task, or a small copy-paste task, assuming you can find code to do it). However, if you just want to split by multiple spaces, a simple iterative solution probably won't be too difficult, or you can simply split by a single space and ignore empty strings.
-
n. m. almost 11 years
C++03 does not come with a regex library. C++11 does but your compiler won't support C++11. You need to either use an existing third-party regex library, or write one of your own.
-
-
nothing-special-here almost 11 years
Well, we figured it out the same way at the same time. :) But you were actually faster (~10 min) in pasting the answer on SO. +1 & accept.
-
Pete Becker about 9 years
@Narek - either that, or add explicit template arguments: regex_token_iterator<std::string::iterator>. sregex_token_iterator is easier. Fixed. Thanks.
-
Lu4 almost 9 years
You should also agree that using C++ to split a string looks like even larger overkill; in C# you just do str.split(...) ;)
-
solstice333 over 7 years
The last example on the cplusplus.com reference doc is similar to this answer.