Case insensitive std::string.find()

Solution 1

You could use std::search with a custom predicate.

#include <algorithm>
#include <iostream>
#include <locale>
#include <string>

// templated version of my_equal so it could work with both char and wchar_t
template<typename charT>
struct my_equal {
    my_equal( const std::locale& loc ) : loc_(loc) {}
    bool operator()(charT ch1, charT ch2) const {
        return std::toupper(ch1, loc_) == std::toupper(ch2, loc_);
    }
private:
    const std::locale& loc_;
};

// find substring (case insensitive)
template<typename T>
int ci_find_substr( const T& str1, const T& str2, const std::locale& loc = std::locale() )
{
    typename T::const_iterator it = std::search( str1.begin(), str1.end(), 
        str2.begin(), str2.end(), my_equal<typename T::value_type>(loc) );
    if ( it != str1.end() ) return static_cast<int>( it - str1.begin() );
    else return -1; // not found
}

int main(int argc, char *argv[])
{
    // string test
    std::string str1 = "FIRST HELLO";
    std::string str2 = "hello";
    int f1 = ci_find_substr( str1, str2 );

    // wstring test
    std::wstring wstr1 = L"ОПЯТЬ ПРИВЕТ";
    std::wstring wstr2 = L"привет";
    int f2 = ci_find_substr( wstr1, wstr2 );

    return 0;
}

Solution 2

The new C++11 style:

#include <algorithm>
#include <string>
#include <cctype>

/// Try to find in the Haystack the Needle - ignore case
bool findStringIC(const std::string & strHaystack, const std::string & strNeedle)
{
  auto it = std::search(
    strHaystack.begin(), strHaystack.end(),
    strNeedle.begin(),   strNeedle.end(),
    // Take unsigned char: std::toupper(int) is undefined for negative char values.
    [](unsigned char ch1, unsigned char ch2) { return std::toupper(ch1) == std::toupper(ch2); }
  );
  return (it != strHaystack.end() );
}

An explanation of std::search can be found on cplusplus.com.
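
findStringIC only reports whether a match exists, while std::string::find returns a position and accepts a starting offset. A minimal sketch of a variant that does both is below. The name find_ci and its pos parameter are hypothetical and not part of the answer above, and the comparison is still per-character std::toupper in the current C locale, so it shares the same localization limits.

```cpp
#include <algorithm>
#include <cctype>
#include <cstddef>
#include <string>

// Hypothetical helper: index of the first case-insensitive match at or
// after `pos`, or std::string::npos if there is none.
std::size_t find_ci(const std::string& haystack, const std::string& needle,
                    std::size_t pos = 0)
{
    if (pos > haystack.size()) return std::string::npos;
    auto first = haystack.begin() + static_cast<std::string::difference_type>(pos);
    auto it = std::search(
        first, haystack.end(), needle.begin(), needle.end(),
        // unsigned char parameters keep std::toupper away from negative values
        [](unsigned char a, unsigned char b) {
            return std::toupper(a) == std::toupper(b);
        });
    // For an empty needle std::search returns `first`, which mirrors
    // std::string::find reporting `pos`.
    if (it == haystack.end() && !needle.empty()) return std::string::npos;
    return static_cast<std::size_t>(it - haystack.begin());
}
```

Called as find_ci(text, "hello") it searches from the start; passing a third argument resumes the search from that index, which covers the "start from an offset" case raised in the comments.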

Solution 3

Why not use Boost.StringAlgo?

#include <boost/algorithm/string/find.hpp>
#include <string>

bool Foo()
{
   //case insensitive find

   std::string str("Hello");

   boost::iterator_range<std::string::const_iterator> rng;

   rng = boost::ifind_first(str, std::string("EL"));

   // An empty range means no match was found.
   return !rng.empty();
}

Solution 4

Why not just convert both strings to lowercase before you call find()?

tolower

Notice:

Solution 5

Since you're doing substring searches (std::string) and not element (character) searches, there's unfortunately no existing solution I'm aware of that's immediately accessible in the standard library to do this.

Nevertheless, it's easy enough to do: simply convert both strings to upper case (or both to lower case - I chose upper in this example).

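#include <algorithm>
#include <cctype>
#include <iterator>
#include <string>

std::string upper_string(const std::string& str)
{
    std::string upper;
    // Take unsigned char so std::toupper never receives a negative value.
    std::transform(str.begin(), str.end(), std::back_inserter(upper),
                   [](unsigned char c) { return static_cast<char>(std::toupper(c)); });
    return upper;
}

std::string::size_type find_str_ci(const std::string& str, const std::string& substr)
{
    return upper_string(str).find(upper_string(substr));
}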

This is not a fast solution (it borders on pessimization, since it copies both strings on every call), but it's the only one I know of off-hand. It's also not that hard to implement your own case-insensitive substring finder if you are worried about efficiency.

Additionally, I need to support std::wstring/wchar_t. Any ideas?

The tolower/toupper overloads in <locale> work on wide strings as well, so the solution above should be just as applicable (simply change std::string to std::wstring).
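
As a rough sketch of that wide-string variant, the version below upper-cases copies of both strings through the std::ctype<wchar_t> facet of a locale and then calls find. The names upper_wstring and find_wstr_ci are hypothetical, and what counts as upper case for non-ASCII characters depends on the locale passed in (the default here is the global locale), so treat it as an illustration and not a full Unicode solution.

```cpp
#include <locale>
#include <string>

// Hypothetical helper: returns an upper-cased copy of `s` using the
// std::ctype<wchar_t> facet of `loc`.
std::wstring upper_wstring(std::wstring s, const std::locale& loc = std::locale())
{
    const auto& facet = std::use_facet<std::ctype<wchar_t>>(loc);
    if (!s.empty()) facet.toupper(&s[0], &s[0] + s.size());  // converts in place
    return s;
}

// Hypothetical helper: case-insensitive find for wide strings.
std::wstring::size_type find_wstr_ci(const std::wstring& str,
                                     const std::wstring& substr,
                                     const std::locale& loc = std::locale())
{
    return upper_wstring(str, loc).find(upper_wstring(substr, loc));
}
```

Like the narrow version, this copies both strings on every call, so it trades speed for simplicity.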

[Edit] An alternative, as pointed out, is to adapt your own case-insensitive string type from basic_string by specifying your own character traits. This works if you can accept all string searches, comparisons, etc. to be case-insensitive for a given string type.
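
A minimal sketch of that traits-based approach follows, in the spirit of the GotW #29 article linked in the comments. The names ci_char_traits and ci_string are illustrative, the helper up is an assumption of this sketch, and the comparison is plain std::toupper in the current C locale.

```cpp
#include <cctype>
#include <cstddef>
#include <string>

// Character traits that compare characters without regard to case.
struct ci_char_traits : std::char_traits<char> {
    // Hypothetical helper: upper-case one char safely via unsigned char.
    static char up(char c) {
        return static_cast<char>(std::toupper(static_cast<unsigned char>(c)));
    }
    static bool eq(char a, char b) { return up(a) == up(b); }
    static bool lt(char a, char b) { return up(a) < up(b); }
    static int compare(const char* s1, const char* s2, std::size_t n) {
        for (; n != 0; ++s1, ++s2, --n) {
            if (lt(*s1, *s2)) return -1;
            if (lt(*s2, *s1)) return 1;
        }
        return 0;
    }
    static const char* find(const char* s, std::size_t n, char a) {
        for (; n != 0; ++s, --n)
            if (eq(*s, a)) return s;
        return nullptr;
    }
};

// Every comparison and search on this type ignores case.
using ci_string = std::basic_string<char, ci_char_traits>;
```

With this type, ci_string("FIRST HELLO").find("hello") goes through the custom traits, and so do operator== and compare. The cost is that ci_string is a distinct type from std::string, so values have to be converted explicitly (for example via c_str()) when the two are mixed.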

Author by

wpfwannabe

Updated on July 05, 2022

Comments

  • wpfwannabe
    wpfwannabe almost 2 years

    I am using std::string's find() method to test if a string is a substring of another. Now I need case insensitive version of the same thing. For string comparison I can always turn to stricmp() but there doesn't seem to be a stristr().

    I have found various answers and most suggest using Boost which is not an option in my case. Additionally, I need to support std::wstring/wchar_t. Any ideas?

    • Alexandre C.
      Alexandre C. almost 14 years
      There's a GotW about this very subject: gotw.ca/gotw/029.htm
    • Nasir
      Nasir over 8 years
      stristr is not there, but "char *strcasestr(const char *haystack, const char *needle);" is there. Isn't this OK?
    • Yuchen
      Yuchen almost 8 years
      @Nasir, strcasestr is not available under Windows.
  • bkausbk
    bkausbk almost 12 years
    Because it is very inefficient for larger strings.
  • rstackhouse
    rstackhouse about 10 years
    Why are you using templates here?
  • Kirill V. Lyadvinsky
    Kirill V. Lyadvinsky about 10 years
    @rstackhouse, template here is for a support of different char types (char & wchar_t).
  • Lara
    Lara almost 10 years
    Thanks, Kirill. For those as clueless as I am, insert std::advance( it, offset ); after the declaration of the iterator to start the search from an offset.
  • jww
    jww almost 10 years
    "... I have found various answers and most suggest using Boost which is not an option in my case".
  • Enissay
    Enissay over 9 years
    What if I want to find a char c in a string str using the same function? Calling it using findStringIC(str, (string)c) doesn't work.
  • CC.
    CC. over 9 years
    This type of char-to-string cast does not work; you have to actually create the string object, like std::string(1, 'x'). See coliru.stacked-crooked.com/a/af4051dd1d15972e If you do this a lot, it might be worth creating a specific function that does not require creating a new object every time.
  • Bart
    Bart about 8 years
    This is also not really a good idea if your software ever needs to be localized. See Turkey test: haacked.com/archive/2012/07/05/…
  • Alexis Wilke
    Alexis Wilke almost 8 years
    In most cases, it is preferable to use tolower() when doing a case insensitive search. Even Ada changed it to lowercase! There are reasons that Unicode.org probably explains somewhere but I do not know exactly why.
  • CC.
    CC. over 7 years
    Upper case is better msdn.microsoft.com/en-us/library/bb386042.aspx but of course not perfect. If you need Turkish, that's going to be hard stackoverflow.com/questions/234591/upper-vs-lower-case and haacked.com/archive/2012/07/05/…
  • kayleeFrye_onDeck
    kayleeFrye_onDeck about 7 years
    Typically, unless a C++ question is tagged for Boost, it's assumed Boost isn't an option.
  • Basj
    Basj almost 7 years
    For those (like me) who are not familiar with templates, can you also post a standard version without templates, without locales? Just for wstring for example @KirillV.Lyadvinsky?
  • Orwellophile
    Orwellophile over 6 years
    ... did they do away with templates in C++11? I must have missed the memo :)
  • CC.
    CC. over 6 years
    No template needed in this case. For C++17 you might want to take a look at string_view instead of std::string skebanga.github.io/string-view
  • kayleeFrye_onDeck
    kayleeFrye_onDeck about 6 years
    The arguments you'll uncover for doing basic upcase and downcase operations in C++ on anything not encoded as ANSI will overwhelm you xD Simply put, it's not trivial for the standard library to handle as of C++17.
  • kayleeFrye_onDeck
    kayleeFrye_onDeck about 6 years
    That was a great read on string_view! Something new and shiny, and fast! :)
  • SJHowe
    SJHowe almost 6 years
    The last 3 lines of code should be return (extracted_match.length() == subString.length());
  • kayleeFrye_onDeck
    kayleeFrye_onDeck almost 6 years
    "should" might be a bit strong for wording, but I agree that it's an improvement! :) Ty & updated ^_^
  • MiloDC
    MiloDC almost 4 years
    Does the call to std::toupper actually work for wide characters? Wouldn't you need to call std::towupper?
  • Nguyen Manh
    Nguyen Manh over 2 years
    Please add a string.find_first_of or wstring.find_first_of implementation.
  • ycomp
    ycomp about 2 years
    the answer i was looking for, thanks