C++ Extract number from the middle of a string

Solution 1

Updated for C++11

(Important note on compiler regex support: with gcc you need version 4.9 or later. I tested this on g++ versions 4.9[1] and 9.2, using the in-browser compiler on cppreference.com.)

Thanks to user @2b-t, who found a bug in the C++11 code!

Here is the C++11 code:

#include <iostream>
#include <string>
#include <regex>

using std::cout;
using std::endl;

int main() {
    std::string input = "Example_45-3";
    std::string output = std::regex_replace(
        input,
        std::regex("[^0-9]*([0-9]+).*"),
        std::string("$1")
        );
    cout << input << endl;
    cout << output << endl;
}

Boost solution that only requires C++98

Minimal implementation example that works on many strings (not just strings of the form "text_45-text"):

#include <iostream>
#include <string>
using namespace std;
#include <boost/regex.hpp>

int main() {
    string input = "Example_45-3";
    string output = boost::regex_replace(
        input,
        boost::regex("[^0-9]*([0-9]+).*"),
        string("\\1")
        );
    cout << input << endl;
    cout << output << endl;
}

Console output:

Example_45-3
45

Other example strings that this would work on:

  • "asdfasdf 45 sdfsdf"
  • "X = 45, sdfsdf"

For this example I used g++ on Linux with #include <boost/regex.hpp> and -lboost_regex. You could also use the C++11 <regex> library.

Feel free to edit my solution if you have a better regex.
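
One caveat with the regex_replace approach: when the input contains no digits at all, nothing matches and the input comes back unchanged. A minimal sketch of an alternative built on std::regex_search, which reports whether a number was found. The function name first_number is made up for this illustration and is not part of the original answer:

```cpp
#include <regex>
#include <string>

// Hypothetical helper (not from the original answer): stores the first run
// of digits in `out` and returns true, or returns false if there is none.
bool first_number(const std::string& input, std::string& out) {
    static const std::regex digits("[0-9]+");
    std::smatch m;
    if (!std::regex_search(input, m, digits)) {
        return false;  // no digits anywhere in the input
    }
    out = m.str();     // m[0] is the whole match, i.e. the digit run
    return true;
}
```

Called with "Example_45-3" this stores "45" in out; called with a string that has no digits it returns false and leaves out untouched.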


Commentary:

If there are no performance constraints, a regex is ideal for this sort of thing because you aren't reinventing the wheel by writing a bunch of string-parsing code that takes time to write and test fully.

Additionally, if your strings become more complex or have more varied patterns, a regex easily accommodates the complexity. (The question's example pattern is simple enough, but a more complex pattern would often take 10-100+ lines of hand-written code where a one-line regex would do the same.)
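
To illustrate how a regex scales with the pattern, here is a sketch that pulls both numbers out of the question's text_number-number format with two capture groups. The function name both_numbers and the exact pattern are assumptions made for this example:

```cpp
#include <regex>
#include <string>
#include <utility>

// Hypothetical helper: parses "text_number-number" and returns both numbers
// as strings; returns a pair of empty strings if the input has another shape.
std::pair<std::string, std::string> both_numbers(const std::string& input) {
    static const std::regex pattern("[^_]*_([0-9]+)-([0-9]+)");
    std::smatch m;
    // regex_match requires the whole input to match the pattern.
    if (std::regex_match(input, m, pattern)) {
        return std::make_pair(m[1].str(), m[2].str());
    }
    return std::make_pair(std::string(), std::string());
}
```

Getting the second number only needed one more capture group; the hand-written parsers would each need another search step.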


[1] Apparently, full support for C++11 <regex> was implemented and released in g++ version 4.9.x, on Jun 26, 2015. Hat tip to SO questions #1 and #2 for figuring out that the compiler version needs to be 4.9.x.

Solution 2

You can also use the built-in find_first_of and find_first_not_of to find the first "numberstring" in any string.

    std::string first_numberstring(std::string const & str)
    {
      char const* digits = "0123456789";
      std::size_t const n = str.find_first_of(digits);
      if (n != std::string::npos)
      {
        std::size_t const m = str.find_first_not_of(digits, n);
        return str.substr(n, m != std::string::npos ? m-n : m);
      }
      return std::string();
    }
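
A self-contained usage sketch for the function above, which is repeated here so that the snippet compiles on its own. The wrapper first_number_or is a made-up name; it assumes C++11 for std::stoi and falls back to a caller-supplied default when the string has no digits.

```cpp
#include <string>

// Same function as above, repeated so that this snippet compiles on its own.
std::string first_numberstring(std::string const & str)
{
  char const* digits = "0123456789";
  std::size_t const n = str.find_first_of(digits);
  if (n != std::string::npos)
  {
    std::size_t const m = str.find_first_not_of(digits, n);
    return str.substr(n, m != std::string::npos ? m-n : m);
  }
  return std::string();
}

// Hypothetical wrapper: returns the first number as an int, or `fallback`
// when the string has no digits. std::stoi throws std::out_of_range if the
// digit run does not fit in an int; that case is not handled here.
int first_number_or(std::string const & str, int fallback)
{
  std::string const digits = first_numberstring(str);
  return digits.empty() ? fallback : std::stoi(digits);
}
```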

Solution 3

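This should be more efficient than Ashot Khachatryan's solution. Note the use of the character literals '_' and '-' instead of the string literals "_" and "-", which selects the single-character overloads of find. Also note the starting position of the search for '-': it begins at p + 2 because, in this format, at least one digit follows the underscore.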

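#include <string>

// Assumes s has the form text_number-number.
inline std::string mid_num_str(const std::string& s) {
    std::string::size_type p  = s.find('_');
    // At least one digit follows '_', so the search for '-' can start at p + 2.
    std::string::size_type pp = s.find('-', p + 2);
    return s.substr(p + 1, pp - p - 1);
}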

If you need a number instead of a string, like what Alexandr Lapenkov's solution has done, you may also want to try the following:

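#include <cstdlib>  // std::strtol

// strtol stops at the first character that is not part of the number ('-' here).
inline long mid_num(const std::string& s) {
    return std::strtol(&s[s.find('_') + 1], nullptr, 10);
}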
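
Both functions assume the input really has the text_number-number shape. If '_' is missing, find returns npos and p + 1 wraps around to 0, so the result silently starts at the beginning of the string. A sketch of a checked variant, under the assumption that an empty string is an acceptable "not found" result; the name mid_num_str_checked is made up:

```cpp
#include <string>

// Hypothetical checked variant of mid_num_str: returns an empty string when
// the '_' delimiter is missing instead of relying on npos arithmetic.
inline std::string mid_num_str_checked(const std::string& s) {
    const std::string::size_type p = s.find('_');
    if (p == std::string::npos) {
        return std::string();
    }
    // If '-' is absent, pp is npos and substr takes the rest of the string.
    const std::string::size_type pp = s.find('-', p + 1);
    return s.substr(p + 1, pp == std::string::npos ? pp : pp - p - 1);
}
```

The extra comparison costs very little and keeps malformed entries from producing misleading output.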

Solution 4

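sscanf can skip everything up to the underscore and read the integer after it:

#include <cstdio>
#include <string>

std::string ex = "Example_45-3";
int num = 0;  // stays 0 if the input does not match
std::sscanf( ex.c_str(), "%*[^_]_%d", &num );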
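
sscanf returns the number of conversions it stored, so comparing the result with 1 tells you whether a number was actually read. A minimal sketch; the function name number_after_underscore is an assumption for this example:

```cpp
#include <cstdio>
#include <string>

// Hypothetical wrapper: writes the number after the first '_' into `num`
// and returns true, or returns false if the input does not match.
bool number_after_underscore(const std::string& s, int& num) {
    int value = 0;
    // %*[^_] skips (without storing) one or more characters up to the first
    // '_', the literal _ consumes it, and %d reads the integer that follows.
    if (std::sscanf(s.c_str(), "%*[^_]_%d", &value) != 1) {
        return false;
    }
    num = value;
    return true;
}
```

Note that the scanset must match at least one character, so an input that begins with '_' is rejected by this format string.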

Solution 5

I can think of two ways of doing it:

  • Use regular expressions
  • Use an iterator to step through the string, and copy each consecutive digit to a temporary buffer. Break when it reaches an unreasonable length or on the first non-digit after a string of consecutive digits. Then you have a string of digits that you can easily convert.
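
The second approach could be sketched as follows. The function name first_digit_run is made up, the digits are collected into a std::string rather than a fixed buffer, and there is no length cap in this version:

```cpp
#include <cctype>
#include <string>

// Hypothetical helper: walks the string once and collects the first run of
// consecutive digits, stopping at the first non-digit after that run.
std::string first_digit_run(const std::string& s) {
    std::string digits;
    for (std::string::const_iterator it = s.begin(); it != s.end(); ++it) {
        // Cast to unsigned char: passing a negative char to isdigit is undefined.
        if (std::isdigit(static_cast<unsigned char>(*it))) {
            digits += *it;
        } else if (!digits.empty()) {
            break;  // first non-digit after the digit run
        }
    }
    return digits;
}
```

The returned string is empty when the input has no digits, and can otherwise be converted with std::strtol or std::stoi.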
Author: fakeaccount

Updated on April 28, 2020

Comments

  • fakeaccount (about 4 years ago)

    I have a vector containing strings that follow the format of text_number-number

    Eg: Example_45-3

    I only want the first number (45 in the example) and nothing else which I am able to do with my current code:

    std::vector<std::string> imgNumStrVec;
    for(size_t i = 0; i < StrVec.size(); i++){
        std::vector<std::string> seglist;
        std::stringstream ss(StrVec[i]);
        std::string seg, seg2;
        while(std::getline(ss, seg, '_')) seglist.push_back(seg);
        std::stringstream ss2(seglist[1]);
        std::getline(ss2, seg2, '-');
        imgNumStrVec.push_back(seg2); 
    }
    

    Are there more streamlined and simpler ways of doing this? If so, what are they?

    I ask purely out of a desire to learn how to code better. At the end of the day, the code above does successfully extract just the first number, but it seems long-winded and roundabout.