Simple wildcard search algorithm in C++

Solution 1

Here is the function I came up with, using only the limited knowledge I have so far. It works, but the performance is probably poor.

Thanks for all your help. The answers inspired me, even though I couldn't use them directly because the techniques are too advanced for me.

#include <iostream>
#include <string>
using namespace std;

void wildcard(string source, string search)
{
    unsigned int j = 0, i = 0, z = 0;
    string s1 = "", search2 = search;
    bool found = false;
    //s1 collects the matched characters; found records whether a full match was seen

    /*************************IF IT STARTS WITH QUESTION MARK*************************/

    if(search.find('?') == 0)
    {
        for(; z < search.length() && search.at(z) == '?'; z++)
            //loop makes search2 start without question marks; z counts them.
        {
            search2 = search.substr(z + 1);
        }

        for(; j + search.length() <= source.length(); ++j)
            //application of Brute Force Search Algorithm for this case.
            //j is the candidate start index of the whole match.
        {
            i = 0;
            s1 = "";
            //restart the comparison at every candidate position
            while(i < search2.length() && (source.at(z+i+j) == search2.at(i) || search2.at(i) == '?'))
            {
                s1 = s1 + source.at(z+j+i);
                i++;
            }
            if(i == search2.length())
            {
                found = true;
                break;
            }
        }

        if(found)
            //showing results for this case.
        {
            cout << "The matched string was found at index: " << j << endl;
            cout << "The matched string is: " << source.substr(j, search.length()) << endl << endl;
        }
        else
        {
            cout << "The search string could not be found in the source string." << endl << endl;
        }
    }

    /********************IF IT DOES NOT START WITH QUESTION MARK**********************/

    else
        //If it does not start with ?, use normal test.
    {
        for(; j + search.length() <= source.length(); ++j)
            //application of Brute Force Search Algorithm for this case.
        {
            i = 0;
            s1 = "";
            //restart the comparison at every candidate position
            while(i < search.length() && (source.at(i+j) == search.at(i) || search.at(i) == '?'))
            {
                s1 = s1 + source.at(j+i);
                i++;
            }
            if(i == search.length())
            {
                found = true;
                break;
            }
        }

        if(found)
            //results
        {
            cout << "The matched string was found at index: " << j << endl;
            cout << "The matched string is: " << s1 << endl << endl;
        }
        else
        {
            cout << "The search string could not be found in the source string." << endl << endl;
        }
    }
}

Solution 2

I felt bad for only hinting at backtracking and recursion in a comment. Here's an explanation:

Strategy:

Focus on the tokens between wildcards (the wildcards are not what should be matched).

  • extract first token from pattern
  • exit with success for no (more) tokens
  • for each token match in input
    • match the remainder of the pattern against the remainder of the input
    • if no successful submatch, fail, otherwise done

There is recursion (the matching of the remainder calls match(....) recursively).

There is backtracking (if the recursive match doesn't succeed, we try the next token submatch).

Sample (see https://ideone.com/yApYp)

Only using loops and std::string interface (well, and iostreams for displaying test output) :)

#include <iostream>
#include <string>

typedef std::string::const_iterator It;

/*
 * Extract sequences of non-wildcard characters from pattern range
 */
std::string extract_token(It &s, It e) // [s,e) is (sub)pattern
{
    It wcard;
    for (wcard=s; wcard!=e; ++wcard)
        if ('?' == *wcard) break;

    std::string token(s,wcard);

    for (s=wcard; s!=e; ++s)
        if ('?' != *s) break; // treat '??' as '?' in pattern

    return token;
}

/*
 * Match a (sub)pattern against a (sub)input
 *
 * (See "Strategy" above)
 */
bool match(It patb, It pate, const std::string& input)
{
    while (patb != pate)
    {
        // get next token from pattern, advancing patb
        std::string token = extract_token(patb, pate); // updates patb

        if (!token.empty()) // could happen if pattern begins/ends with redundant '?'
        {
            size_t submatch = input.find(token);  // first submatch please

            while (std::string::npos != submatch)  // while we have a submatch
            {
                // continue with the input that follows this occurrence of the token
                if (match(patb, pate, input.substr(submatch + token.size())))
                    return true; // match completed successfully

                // look for later potential submatches (*backtrack*)
                submatch = input.find(token, submatch+1);
            }
            return false; // required token not found
        }
    }
    return true; // no (remaining) pattern, always match
}

bool match(const std::string& pattern, const std::string& input)
{
    // just relay to overload more suited for recursion
    return match(pattern.begin(), pattern.end(), input); 
}

//////////////////////
// TEST PROGRAM

void test(const std::string& pattern, const std::string& input)
{
    std::cout << std::boolalpha;
    std::cout << "match(\"" << pattern << "\", \"" << input << "\") => " 
              << match(pattern, input) << std::endl;
}

int main()
{
    // matches
    test("?????",               "");
    test("?????",               "?????");
    test("",                    "");
    test("",                    "glorious");
    test("?r?o",                "glorious");
    test("some?words?exist",    "some silly words should, most definitely, be existing");
    test("some??words?exist?",  "some silly words should, most definitely, be existing");

    // failing matches
    test("_",                   "");
    test("_",                   "glorious");
    test("_",                   "glorious");
    test("glorious",            "glo?ious");
    test("?some??words?exist?", "bogus");
}

Solution 3

You would probably do best to write two functions: one to check whether a pattern matches a string at a given position, and another that uses the first function to check all positions in the input string.

The function that checks for a matching pattern would loop over all characters in the pattern, and for each of those characters check if it is either ? or is identical to the character at the corresponding position in the input string.
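A minimal sketch of the two functions described above, using only loops and the std::string interface. The names matches_at and find_wildcard are made up for illustration, and '?' is assumed to match exactly one arbitrary character, as in the example from the question.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: returns true if `pattern` matches `text` starting at
// index `pos`. A '?' in the pattern matches any single character.
bool matches_at(const std::string& text, const std::string& pattern, std::size_t pos)
{
    // Not enough characters left in `text` for the whole pattern.
    if (pos > text.size() || pattern.size() > text.size() - pos)
        return false;

    for (std::size_t i = 0; i < pattern.size(); ++i)
    {
        if (pattern[i] != '?' && pattern[i] != text[pos + i])
            return false; // a literal character differs
    }
    return true;
}

// Hypothetical helper: returns the first index at which `pattern` matches
// `text`, or std::string::npos if it matches nowhere.
std::size_t find_wildcard(const std::string& text, const std::string& pattern)
{
    // Written as an addition so the unsigned sizes cannot underflow
    // when the pattern is longer than the text.
    for (std::size_t pos = 0; pos + pattern.size() <= text.size(); ++pos)
    {
        if (matches_at(text, pattern, pos))
            return pos;
    }
    return std::string::npos;
}
```

With the example from the question, find_wildcard("glorious", "?r?o") returns 2, and source.substr(2, 4) gives "orio". Keeping the position check separate from the scan means each function has a single loop, which may be easier to reason about than one nested loop.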

Author by

hevele

Updated on August 02, 2022

Comments

  • hevele
    hevele almost 2 years

    I have an assignment in which I have to create a search pattern that includes the wildcard character '?'. We haven't covered anything further than loops and properties of string libraries yet, so my teacher doesn't want me to use arrays or anything we haven't covered.

    My problem is to create an algorithm for the special character '?'. Do you have any idea how I can integrate it into my program without using more advanced tricks? Everything I tried is either completely wrong or has some mistakes in it.

    The program should request an input from the user for the source string and then ask for another input for the search string, which can include '?'. For example:

    Source string: glorious
    Search string: ?r?o

    The matched string was found at index: 2
    The matched string is: orio