C++11 regex error

12,450

I just did a test using libc++ and clang++. This works as expected. Here's my main:

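#include <iostream>
#include <regex>
#include <string>
#include "regexcase.h"  // declares parseCode (from the question)
using namespace std;

int main() {
    string test_str = "receipt freind theif receive";
    string pattern = "[a-zA-Z]*[^c]ei[a-zA-Z]*";

    try {
        // construct the regex using the POSIX extended grammar
        regex r(pattern, regex_constants::extended);
        smatch results;

        if (regex_search(test_str, results, r))
            cout << results.str() << endl;
        else
            cout << "no match for " << pattern << endl;
    } catch (const regex_error &e) {
        cout << "what: " << e.what() << "; code: " << parseCode(e.code()) << endl;
    }
}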

Output:

freind

On the other hand, GCC 4.7.2 gives this result:

no match for [a-zA-Z]*[^c]ei[a-zA-Z]*

This is because the libstdc++ shipped with GCC 4.7.2 does not yet implement regex. Here is its implementation of regex_search:

template<typename _Bi_iter, typename _Allocator,
         typename _Ch_type, typename _Rx_traits>
inline bool regex_search(_Bi_iter __first, _Bi_iter __last,
                         match_results<_Bi_iter, _Allocator>& __m,
                         const basic_regex<_Ch_type, _Rx_traits>& __re,
                         regex_constants::match_flag_type __flags) {
    return false;
}
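Since a stub like this compiles cleanly and only misbehaves at run time, a quick sanity check can show whether a given standard library has a working regex. The following is a minimal sketch, not part of the original answer; the function name regex_is_functional is a hypothetical helper chosen for illustration.

```cpp
#include <regex>
#include <string>

// Hypothetical helper: returns true only if <regex> can compile a trivial
// pattern and find an obvious match. An unimplemented library is expected
// to either throw std::regex_error or report no match.
bool regex_is_functional() {
    try {
        const std::string text = "abc";
        const std::regex r("b");
        return std::regex_search(text, r);
    } catch (const std::regex_error&) {
        return false;
    }
}
```

Calling regex_is_functional() at the start of main and printing a warning when it returns false makes it clear that the library, not the pattern, is at fault.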

And just to note, it is very helpful to include a small program that readers could compile. That way there is no confusion about what code is being run.

Author by

zyy7259

Updated on June 04, 2022

Comments

  • zyy7259
    zyy7259 almost 2 years

    This is example code from C++ Primer, 5th Edition, section 17.3.3, Using the Regular Expression Library.

    Main file main.cpp:

    #include <iostream>
    #include "regexcase.h"
    using namespace std;
    
    int main() {
        using_regex();
        return 0;
    }
    

    Header file regexcase.h:

    #ifndef REGEXCASE_H_
    #define REGEXCASE_H_
    
    #include <regex>
    #include <string>
    
    void using_regex();
    std::string parseCode(std::regex_constants::error_type etype);
    
    #endif /* REGEXCASE_H_ */
    

    Source file regexcase.cpp:

    #include "regexcase.h"
    #include <iostream>
    using namespace std;
    
    void using_regex() {
        // look for words that violate a well-known spelling rule of thumb, "i before e, except after c":
        // find the characters ei that follow a character other than c
        string pattern("[^c]ei");
        // we want the whole word in which our pattern appears
        pattern = "[a-zA-Z]*" + pattern + "[a-zA-Z]*";  // originally "[[:alpha:]]*" on both sides
        try {
            regex r(pattern, regex_constants::extended);    // construct a regex to find pattern (extended flag added later)
            smatch results;     // define an object to hold the results of a search
            // define a string that has text that does and doesn't match pattern
            string test_str = "receipt freind theif receive";
            // use r to find a match to pattern in test_str
            if (regex_search(test_str, results, r)) // if there is a match
                cout << results.str() << endl;      // print the matching word
            else
                cout << "no match for " << pattern << endl;
        } catch (regex_error &e) {
            cout << "what: " << e.what() << "; code: " << parseCode(e.code()) << endl;
        }
    }
    
    string parseCode(regex_constants::error_type etype) {
        switch (etype) {
        case regex_constants::error_collate:
            return "error_collate: invalid collating element request";
        case regex_constants::error_ctype:
            return "error_ctype: invalid character class";
        case regex_constants::error_escape:
            return "error_escape: invalid escape character or trailing escape";
        case regex_constants::error_backref:
            return "error_backref: invalid back reference";
        case regex_constants::error_brack:
            return "error_brack: mismatched bracket([ or ])";
        case regex_constants::error_paren:
            return "error_paren: mismatched parentheses(( or ))";
        case regex_constants::error_brace:
            return "error_brace: mismatched brace({ or })";
        case regex_constants::error_badbrace:
            return "error_badbrace: invalid range inside a { }";
        case regex_constants::error_range:
            return "error_range: invalid character range(e.g., [z-a])";
        case regex_constants::error_space:
            return "error_space: insufficient memory to handle this regular expression";
        case regex_constants::error_badrepeat:
            return "error_badrepeat: a repetition character (*, ?, +, or {) was not preceded by a valid regular expression";
        case regex_constants::error_complexity:
            return "error_complexity: the requested match is too complex";
        case regex_constants::error_stack:
            return "error_stack: insufficient memory to evaluate a match";
        default:
            return "";
        }
    }
    

    The output of calling using_regex() is: what: regex_error; code: error_brack: mismatched bracket([ or ])

    It seems that the regex can't parse the bracket.

    Following the answers to this question, I passed regex_constants::extended when constructing the regex object: regex r(pattern, regex_constants::extended);

    Then the output is: no match for [[:alpha:]]*[^c]ei[[:alpha:]]*

    It seems that the regex can't match the pattern.

    Then I replaced the character class [[:alpha:]]* with [a-zA-Z]* (with regex_constants::extended still set). The output is still: no match for [a-zA-Z]*[^c]ei[a-zA-Z]*

    Platform: Windows

    Tools used: Eclipse for C/C++; MinGW (g++ --version: g++ 4.7.2)

    EDIT: Thanks @sharth, I added the main file to complete the code.

  • zyy7259
    zyy7259 about 11 years
    So what should I do? Should I use boost::regex instead (as @Maxim Yegorushkin said above)?
  • Bill Lynch
    Bill Lynch about 11 years
    Boost's implementation would work fine. In fact, I just did a quick test using Boost 1.53.0 and GCC 4.7.2, and it worked the same as libc++.
  • Admin
    Admin over 10 years
    What version of clang? The above doesn't work on 3.3 rc2.
  • Bill Lynch
    Bill Lynch over 10 years
    @wvxvw: I was likely using whatever version of clang was included with OS X at the time. The larger issue, however, is that libstdc++ does not include support for regex. So, on OS X, you can compile with -stdlib=libc++ -std=c++11. On Linux, you will likely need to use Boost's implementation.
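
One way to follow the advice in the comments above is to pick the regex implementation at compile time, so the same code builds against either std or Boost. This is only a sketch under assumptions: the macro USE_BOOST_REGEX, the namespace alias rx, and the function first_violation are hypothetical names introduced here, and depending on the Boost version the Boost build may also need to be linked against the Boost.Regex library.

```cpp
#include <string>

#ifdef USE_BOOST_REGEX          // hypothetical macro, e.g. passed as -DUSE_BOOST_REGEX
#include <boost/regex.hpp>
namespace rx = boost;           // use Boost's implementation
#else
#include <regex>
namespace rx = std;             // use the standard library's implementation
#endif

// Hypothetical helper: returns the first word that contains "ei" preceded by
// a character other than 'c', or an empty string if there is no such word.
std::string first_violation(const std::string& text) {
    const rx::regex r("[a-zA-Z]*[^c]ei[a-zA-Z]*", rx::regex::extended);
    rx::smatch results;
    return rx::regex_search(text, results, r) ? results.str() : std::string();
}
```

With a working regex implementation, first_violation("receipt freind theif receive") should return the same word the libc++ test printed, freind.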