Case insensitive 'Contains(string)'

Solution 1

To test if the string paragraph contains the string word (thanks @QuarterMeister)

culture.CompareInfo.IndexOf(paragraph, word, CompareOptions.IgnoreCase) >= 0

Where culture is the instance of CultureInfo describing the language that the text is written in.

This solution is transparent about the definition of case-insensitivity, which is language dependent. For example, the English language uses the characters I and i for the upper and lower case versions of the ninth letter, whereas the Turkish language uses these characters for the eleventh and twelfth letters of its 29-letter alphabet. The Turkish upper case version of 'i' is the unfamiliar character 'İ'.

Thus the strings tin and TIN are the same word in English, but different words in Turkish. As I understand, one means 'spirit' and the other is an onomatopoeia word. (Turks, please correct me if I'm wrong, or suggest a better example)

To summarise, you can only answer the question 'are these two strings the same but in different cases' if you know what language the text is in. If you don't know, you'll have to take a punt. Given English's hegemony in software, you should probably resort to CultureInfo.InvariantCulture, because it will be wrong in familiar ways.
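As a minimal sketch of the point above, the same call can be wrapped in a helper that takes the culture as an explicit argument. The class name CultureContains, the method name ContainsIgnoreCase and the Demo method are hypothetical names chosen for illustration; only CompareInfo.IndexOf and CompareOptions.IgnoreCase come from the answer. The Turkish result is deliberately not stated, because it depends on the globalization data available to the runtime.

```csharp
using System;
using System.Globalization;

public static class CultureContains
{
    // Hypothetical helper: wraps the CompareInfo.IndexOf call from Solution 1
    // so the caller has to say which culture's casing rules apply.
    public static bool ContainsIgnoreCase(string paragraph, string word, CultureInfo culture)
    {
        return culture.CompareInfo.IndexOf(paragraph, word, CompareOptions.IgnoreCase) >= 0;
    }

    // Prints the result of the same search under two cultures.
    // Assumption: the "tr-TR" culture is available; in invariant-globalization
    // mode GetCultureInfo may throw or fall back to invariant behaviour.
    public static void Demo()
    {
        CultureInfo invariant = CultureInfo.InvariantCulture;
        CultureInfo turkish = CultureInfo.GetCultureInfo("tr-TR");

        Console.WriteLine(ContainsIgnoreCase("TIN", "tin", invariant));
        Console.WriteLine(ContainsIgnoreCase("TIN", "tin", turkish));
    }
}
```

Passing the culture explicitly keeps the language assumption visible at the call site instead of hiding it in the current thread's culture.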

Solution 2

You could use the String.IndexOf Method and pass StringComparison.OrdinalIgnoreCase as the type of search to use:

string title = "STRING";
bool contains = title.IndexOf("string", StringComparison.OrdinalIgnoreCase) >= 0;

Even better is defining a new extension method for string:

public static class StringExtensions
{
    public static bool Contains(this string source, string toCheck, StringComparison comp)
    {
        return source?.IndexOf(toCheck, comp) >= 0;
    }
}

Note that the null-propagation operator ?. is available since C# 6.0 (VS 2015); for older versions use

if (source == null) return false;
return source.IndexOf(toCheck, comp) >= 0;

USAGE:

string title = "STRING";
bool contains = title.Contains("string", StringComparison.OrdinalIgnoreCase);
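Several comments on this answer discuss null arguments and the empty search string. The following is one possible variant, not the answer author's code, that returns false when either argument is null and otherwise defers to IndexOf, which already reports an empty search string as found. The names StringContainsExtensions and ContainsText are hypothetical; a distinct method name is used because newer runtimes ship their own String.Contains(String, StringComparison).

```csharp
using System;

public static class StringContainsExtensions
{
    // Hypothetical name, chosen so it cannot clash with the built-in
    // String.Contains(String, StringComparison) on newer runtimes.
    public static bool ContainsText(this string source, string toCheck, StringComparison comp)
    {
        // Extension methods can be invoked on a null reference, and
        // IndexOf throws ArgumentNullException for a null search value,
        // so both are checked here.
        if (source == null || toCheck == null)
        {
            return false;
        }

        // IndexOf returns 0 for an empty search string, so an empty
        // toCheck is reported as contained, as String.Contains documents.
        return source.IndexOf(toCheck, comp) >= 0;
    }
}
```

Whether a null argument should return false or throw is a design choice; this sketch picks the tolerant behaviour suggested in the comments.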

Solution 3

You can use IndexOf() like this:

string title = "STRING";

if (title.IndexOf("string", 0, StringComparison.CurrentCultureIgnoreCase) != -1)
{
    // The string exists in the original
}

Since 0 (zero) can be an index, you check against -1.

MSDN

The zero-based index position of value if that string is found, or -1 if it is not. If value is String.Empty, the return value is 0.

Solution 4

Alternative solution using Regex:

bool contains = Regex.IsMatch("StRiNG to search", Regex.Escape("string"), RegexOptions.IgnoreCase);
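The Regex.Escape call matters because the search text is otherwise interpreted as a pattern, as the comments point out. A small sketch of the difference follows; RegexContains, ContainsIgnoreCase and UnescapedIsMatch are hypothetical names, and adding RegexOptions.CultureInvariant is an assumption about the desired behaviour rather than part of the original answer.

```csharp
using System;
using System.Text.RegularExpressions;

public static class RegexContains
{
    // Treats 'literal' as plain text: Regex.Escape neutralises
    // metacharacters such as '.', '*', '+' and '('.
    public static bool ContainsIgnoreCase(string input, string literal)
    {
        return Regex.IsMatch(
            input,
            Regex.Escape(literal),
            RegexOptions.IgnoreCase | RegexOptions.CultureInvariant);
    }

    // For contrast only: without escaping, "." matches any character,
    // so this reports a match even when the input has no dot in it.
    public static bool UnescapedIsMatch(string input, string pattern)
    {
        return Regex.IsMatch(input, pattern, RegexOptions.IgnoreCase);
    }
}
```

For example, searching "no dot here" for "." succeeds with UnescapedIsMatch but not with ContainsIgnoreCase.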

Solution 5

.NET Core 2.0+ (including .NET 5.0+)

.NET Core has had a pair of methods to deal with this since version 2.0:

  • String.Contains(Char, StringComparison)
  • String.Contains(String, StringComparison)

Example:

"Test".Contains("test", System.StringComparison.CurrentCultureIgnoreCase);

It is now officially part of the .NET Standard 2.1, and therefore part of all the implementations of the Base Class Library that implement this version of the standard (or a higher one).
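A short sketch of both overloads, assuming a runtime that implements them (.NET Core 2.0+ or .NET Standard 2.1). The wrapper class BuiltInContainsDemo and its method names are hypothetical and exist only to make the calls easy to try.

```csharp
using System;

public static class BuiltInContainsDemo
{
    // Uses String.Contains(String, StringComparison).
    public static bool HasSubstring(string text, string value)
    {
        return text.Contains(value, StringComparison.OrdinalIgnoreCase);
    }

    // Uses String.Contains(Char, StringComparison).
    public static bool HasChar(string text, char value)
    {
        return text.Contains(value, StringComparison.OrdinalIgnoreCase);
    }
}
```

OrdinalIgnoreCase is used here only as an example; any StringComparison value can be passed, including the culture-sensitive ones discussed in Solution 1.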

Author: Boris Callens

Senior .net programmer. Belgium(Antwerp) based. linked-in My real email is gmail.

Updated on April 11, 2022

Comments

  • Boris Callens
    Boris Callens about 2 years

    Is there a way to make the following return true?

    string title = "ASTRINGTOTEST";
    title.Contains("string");
    

    There doesn't seem to be an overload that allows me to set the case sensitivity. Currently I UPPERCASE them both, but that's just silly (by which I am referring to the i18n issues that come with up- and down casing).

    UPDATE

    This question is ancient and since then I have realized I asked for a simple answer for a really vast and difficult topic if you care to investigate it fully.

    For most cases, in mono-lingual, English code bases this answer will suffice. I suspect that because most people coming here fall into this category, it is the most popular answer.

    This answer however brings up the inherent problem that we can't compare text case insensitive until we know both texts are the same culture and we know what that culture is. This is maybe a less popular answer, but I think it is more correct and that's why I marked it as such.

    • گلی
      گلی almost 3 years
      try this one: Yourculture.CompareInfo.IndexOf(paragraph, word, CompareOptions.IgnoreCase) >= 0
  • Matt Hamilton
    Matt Hamilton over 15 years
    Interestingly, I've seen ToUpper() recommended over the use of ToLower() in this sort of scenario, because apparently ToLower() can "lose fidelity" in certain cultures - that is, two different upper-case characters translate to the same lower-case character.
  • Jon Skeet
    Jon Skeet over 15 years
    Search for "Turkey test" :)
  • Marc Stober
    Marc Stober over 15 years
    In some French locales, uppercase letters don't have the diacritics, so ToUpper() may not be any better than ToLower(). I'd say use the proper tools if they're available - case-insensitive compare.
  • Peter Gfader
    Peter Gfader almost 15 years
    Don't use ToUpper or ToLower, and do what Jon Skeet said
  • satyavrat
    satyavrat over 13 years
    Where would be the best place to put something like this within an app structure?
  • JaredPar
    JaredPar over 13 years
    @Andi, I usually have a couple of code files with general purpose extension methods where this would go.
  • Ed S.
    Ed S. over 13 years
    Just saw this again after two years and a new downvote... anyway, I agree that there are better ways to compare strings. However, not all programs will be localized (most won't) and many are internal or throwaway apps. Since I can hardly expect credit for advice best left for throwaway apps... I'm moving on :D
  • amurra
    amurra over 13 years
    If toCheck is the empty string it needs to return true per the Contains documentation: "true if the value parameter occurs within this string, or if value is the empty string (""); otherwise, false."
  • Mark Rolich
    Mark Rolich almost 13 years
    The questioner is looking for Contains not Compare.
  • Saravanan
    Saravanan over 12 years
    Good Idea, also we have a lot of bitwise combinations in RegexOptions like RegexOptions.IgnoreCase & RegexOptions.IgnorePatternWhitespace & RegexOptions.CultureInvariant; for anyone if helps.
  • user3018301
    user3018301 over 12 years
    Based on amurra's comment above, doesn't the suggested code need to be corrected? And shouldn't this be added to the accepted answer, so that the best response is first?
  • wonea
    wonea over 12 years
    Must say I prefer this method although using IsMatch for neatness.
  • cHao
    cHao over 12 years
    What's worse, since the search string is interpreted as a regex, a number of punctuation chars will cause incorrect results (or trigger an exception due to an invalid expression). Try searching for "." in "This is a sample string that doesn't contain the search string". Or try searching for "(invalid", for that matter.
  • Dan Mangiarelli
    Dan Mangiarelli over 12 years
    @cHao: In that case, Regex.Escape could help. Regex still seems unnecessary when IndexOf / extension Contains is simple (and arguably more clear).
  • Jed
    Jed over 12 years
    Note that I was not implying that this Regex solution was the best way to go. I was simply adding to the list of answers to the original posted question "Is there a way to make the following return true?".
  • wonea
    wonea over 12 years
    @cHao: thanks for highlighting that, illuminating. I still prefer this way for the inherent power of regexes.
  • VoodooChild
    VoodooChild over 12 years
    @JaredPar: I am curious to know if .Net 4 or .Net 4.5 has this built in, would you know?
  • JaredPar
    JaredPar over 12 years
    @VoodooChild it doesn't appear so. Judging by the 4.0 API listing here msdn.microsoft.com/en-us/library/system.string_methods.aspx
  • Yogesh Pareek
    Yogesh Pareek over 12 years
    Doesn't seem to work within Entity Framework (4.x) queries. Probably because it's within the LINQ to SQL portion. I get the following error: LINQ to SQL does not recognize the method 'Boolean Contains(System.String, System.String, System.StringComparison)' method, and this method cannot be translated into a store expression. A sub-query performed after the results are returned works. I don't think it has anything to do with the extension as much as LINQ to SQL doesn't know how to translate the code to a SQL query.
  • Michael Bahig
    Michael Bahig over 12 years
    there is a reference site for such useful extension, to share the benefit. this is an entry very similar to this one there extensionmethod.net/Details.aspx?ID=473
  • vulcan raven
    vulcan raven over 11 years
    @DuckMaestro, the accepted answer is implementing Contains with IndexOf. So this approach is equally helpful! The C# code example on this page is using string.Compare(). SharePoint team's choice that is!
  • XåpplI'-I0llwlg'I  -
    XåpplI'-I0llwlg'I - over 11 years
    This will match against a pattern, though. In your example, if fileNamestr has any special regex characters (e.g. *, +, ., etc.) then you will be in for quite a surprise. The only way to make this solution work like a proper Contains function is to escape fileNamestr by doing Regex.Escape(fileNamestr).
  • Richard Pursehouse
    Richard Pursehouse over 11 years
    Great string extension method! I've edited mine to check the source string is not null to prevent any object reference errors from occurring when performing .IndexOf().
  • Timothy Walters
    Timothy Walters over 11 years
    @JookyDFW LINQ to SQL (and LINQ to Entities against a SQL database) will use string comparisons using the default collation in your DB, which in most cases is case-insensitive, so "STRING" and "string" match when hitting the DB, but don't match on the returned results without using this extension.
  • Colonel Panic
    Colonel Panic about 11 years
    This gives the same answer as paragraph.ToLower(culture).Contains(word.ToLower(culture)) with CultureInfo.InvariantCulture and it doesn't solve any localisation issues. Why over complicate things? stackoverflow.com/a/15464440/284795
  • Boris Callens
    Boris Callens about 11 years
    I see your point and you are probably right. This question is ancient and I need to read over it again, but I think I'll change the accepted answer.
  • Quartermeister
    Quartermeister about 11 years
    Why not culture.CompareInfo.IndexOf(paragraph, word, CompareOptions.IgnoreCase) >= 0? That uses the right culture and is case-insensitive, it doesn't allocate temporary lowercase strings, and it avoids the question of whether converting to lowercase and comparing is always the same as a case-insensitive comparison.
  • Colonel Panic
    Colonel Panic about 11 years
    @Quartermeister that'll work. I tried to find out how CompareInfo.IndexOf defines case-insensitive comparison, but the method simply wraps InternalFindNLSStringEx which is undocumented.
  • JaredPar
    JaredPar about 11 years
    @ColonelPanic the ToLower version includes 2 allocations which are unnecessary in a comparison / search operation. Why needlessly allocate in a scenario that doesn't require it?
  • JaredPar
    JaredPar about 11 years
    This solution also needlessly pollutes the heap by allocating memory for what should be a searching function
  • Quartermeister
    Quartermeister about 11 years
    Comparing with ToLower() will give different results from a case-insensitive IndexOf when two different letters have the same lowercase letter. For example, calling ToLower() on either U+0398 "Greek Capital Letter Theta" or U+03F4 "Greek Capital Letter Theta Symbol" results in U+03B8, "Greek Small Letter Theta", but the capital letters are considered different. Both solutions consider lowercase letters with the same capital letter different, such as U+0073 "Latin Small Letter S" and U+017F "Latin Small Letter Long S", so the IndexOf solution seems more consistent.
  • Quartermeister
    Quartermeister about 11 years
    There is MSDN documentation for FindNLSStringEx, and that probably applies to InternalFindNLSStringEx. The flag is NORM_IGNORECASE, which "ignores any tertiary distinction, whether or not it is actually linguistic case", but I don't know enough about NLS to know what a "tertiary distinction" is.
  • kutschkem
    kutschkem about 11 years
    To make this answer a little more complete, considering you want to do web mining: If you don't know the language of a page a-priori, you can quite easily figure out with a simple unigram-based language model. The only problem is getting the data for enough different languages - but probably there are libraries out there that can predict a page's language - i would guess this is a common enough problem.
  • Simon Mourier
    Simon Mourier about 11 years
    @Quartermeister - tertiary distinction is case distinction. Could be linguistic (turkish i -> İ) or not (ascii i -> I). A definition can be found on oracle site: docs.oracle.com/cd/B28359_01/server.111/b28298/… more on linguistic casing here: blogs.msdn.com/b/michkap/archive/2004/12/11/279942.aspx
  • Simon Mourier
    Simon Mourier about 11 years
    @Quartermeister - and BTW, I believe .NET 2 and .NET 4 behave differently on this, as .NET 4 always uses NORM_LINGUISTIC_CASING while .NET 2 did not (this flag appeared with Windows Vista).
  • Michael Freidgeim
    Michael Freidgeim about 11 years
    Should you move method with CompareOptions.IgnoreCase to be first, because .ToLower is obviously inefficient?
  • Jonathan Stark
    Jonathan Stark almost 11 years
    Now this will return true if source is an empty string or null no matter what toCheck is. That cannot be correct. Also IndexOf already returns true if toCheck is an empty string and source is not null. What is needed here is a check for null. I suggest if (source == null || value == null) return false;
  • Casey
    Casey over 10 years
    Your answer is exactly the same as guptat59's but, as was pointed out on his answer, this will match a regular expression, so if the string you're testing contains any special regex characters it will not yield the desired result.
  • Casey
    Casey about 10 years
    I would recommend, in the same spirit, adding CompareOptions.IgnoreKanaType and CompareOptions.IgnoreWidth.
  • hikalkan
    hikalkan almost 10 years
    This is not culture-specific and may fail for some cases. culture.CompareInfo.IndexOf(paragraph, word, CompareOptions.IgnoreCase) should be used.
  • seebiscuit
    seebiscuit over 9 years
    What about LINQ IEnumerable.Contains which lets you specify a StringComparison option like StringComparison.CurrentCultureIgnoreCase? See MSDN msdn.microsoft.com/en-us/library/bb339118(v=vs.95).aspx
  • JaredPar
    JaredPar over 9 years
    @Seabiscuit that won't work because string is an IEnumerable<char> hence you can't use it to find substrings
  • ygoe
    ygoe almost 9 years
    Even if your app isn't localised, it may still run in an affected region, with the local thread culture in effect, and fail.
  • Ed S.
    Ed S. almost 9 years
    @LonelyPixel: I suppose my point was that this code shouldn't appear in anything but a throw-away type of application. In that case, I don't think you really care.
  • Chen
    Chen over 8 years
    Why didn't you write "ddddfg".IndexOf("Df", StringComparison.OrdinalIgnoreCase) ?
  • Boris Callens
    Boris Callens over 8 years
    @fiat This used to be the accepted answer, but is actually not really the correct answer. As long as you stay within the realm of standard Latin characters this looks like a trivial difference, but for large parts of the world it really isn't. The problem is I asked for a simple answer for what I now know is a really difficult topic and the answer really depends on the scenario. I'll update the original question to reflect that.
  • Liam
    Liam over 8 years
    This will be dramatically less efficient than using IndexOf. The use of a Regex will add considerably more processing time, memory, etc.
  • What Would Be Cool
    What Would Be Cool over 8 years
    Just as a style preference, I would just add an explicit method, rather than overload Contains: public static bool ContainsIgnoreCase(this string source, string toCheck) { return source.IndexOf(toCheck, StringComparison.OrdinalIgnoreCase) >= 0; }
  • Chris Marisic
    Chris Marisic over 8 years
    @CodeBlend most likely they tested all their internal stuff that is aware of their changes, and no one tested (or no one cared) that it broke all external links.
  • Jeppe Stig Nielsen
    Jeppe Stig Nielsen over 8 years
    A word of warning: The default for string.IndexOf(string) is to use the current culture, while the default for string.Contains(string) is to use the ordinal comparer. As we know, the former can be changed by picking a longer overload, while the latter cannot be changed. A consequence of this inconsistency is the following code sample: Thread.CurrentThread.CurrentCulture = CultureInfo.InvariantCulture; string self = "Waldstrasse"; string value = "straße"; Console.WriteLine(self.Contains(value));/* False */ Console.WriteLine(self.IndexOf(value) >= 0);/* True */
  • Liam
    Liam over 7 years
    Why avoid string.ToLower() when doing case-insensitive string comparisons? Tl;Dr It's costly because a new string is "manufactured".
  • Liam
    Liam over 7 years
    This is a straight up copy of this answer and suffers from the same issues as noted in that answer
  • Lucas
    Lucas over 7 years
    The source can't be null
  • James
    James about 7 years
    I just tried this. I'm searching for 154 string patterns in 40,000 files, ignoring lower case. It is extremely slow and took hours to run. Using Regex or ToLower might be more error prone for non-English searches, but they're both way faster.
  • BenKoshy
    BenKoshy about 7 years
    what if you know you're always gonna get an english string. which one to use?
  • Fabian Bigler
    Fabian Bigler about 7 years
    @BKSpurgeon I'd use OrdinalIgnoreCase, if case does not matter
  • ANeves
    ANeves almost 7 years
    Because this doesn't work in simple scenarios (".", "no dot here"), this is not an "alternate solution".
  • bernieslearnings
    bernieslearnings over 6 years
    Downvote for just being incorrect. What if title = StRiNg? StRiNg != string and StRiNg != STRING
  • Jar
    Jar over 6 years
    Agreed. Study regular expressions
  • O Thạnh Ldt
    O Thạnh Ldt about 6 years
    I was wrong. Edit the answer as follows, it is very simple: title.ToLower().Contains("string") // of course "string" is lowercase
  • Kyle Delaney
    Kyle Delaney about 6 years
    if (string.IsNullOrEmpty(source)) return string.IsNullOrEmpty(toCheck);
  • Kyle Delaney
    Kyle Delaney about 6 years
    How could source be null? This is an extension method.
  • Iain
    Iain about 6 years
    @KyleDelaney extension method this can be null. I wish C# had a builtin null check there, but it doesn't.
  • JackAce
    JackAce almost 6 years
    Is searching for "Turkey test" the same as searching for "TURKEY TEST"?
  • Ed S.
    Ed S. almost 6 years
    @JackAce: depends on the application.
  • Tore Aurstad
    Tore Aurstad almost 6 years
    I wrote this same extension method, but I null-checked the string toCheck instead: if the coder passes in a null value, the IndexOf method throws an ArgumentNullException anyway. So the extension method could check both source and toCheck to be fault tolerant, perhaps.
  • bytedev
    bytedev over 5 years
    So an empty string contains "Foo"... how is that true?
  • AFract
    AFract over 5 years
    tip : Instead of using current culture for cultureInfo value, you can also use CultureInfo.InvariantCulture
  • Boris Callens
    Boris Callens over 5 years
    Assuming your paragraph and word will always be in en-US
  • AndrewWhalan
    AndrewWhalan almost 5 years
    To avoid issues with forcing the culture to en-US, use return CultureInfo.CurrentCulture.CompareInfo.IndexOf(paragraph, word, CompareOptions.IgnoreCase) >= 0; instead.
  • Alex Gordon
    Alex Gordon almost 5 years
    i had to downvote this as well as 6 other people, this is simply wrong. should be return false; not return true;
  • Alex Gordon
    Alex Gordon almost 5 years
    why are you allowing ANOTHER layer of abstraction over StringComparison ?
  • Bedir
    Bedir over 4 years
    SIK and sik. Big difference.
  • phuclv
    phuclv over 4 years
    besides, parsing and matching a regex is much more resource-intensive than a simple case-insensitive comparison
  • Nyerguds
    Nyerguds over 4 years
    @Iain If you call myString.SomeMethod() and myString is null you well and truly deserve your null pointer exception. It's not the job of an extension method to check that.
  • Paweł Bulwan
    Paweł Bulwan about 4 years
    Now also available in .NET Standard 2.1
  • Iain
    Iain about 4 years
    @Nyerguds no, calling an extension method on a null pointer works. It surprised me when I first saw it, but :shrug:
  • Nyerguds
    Nyerguds about 4 years
    @Iain My point stands; even if that works, it's not the job of the method to check that, because extensions are supposed to act like normal functions executed on an object. That said, it is certainly useful to know that little quirk.
  • Nekuskus
    Nekuskus almost 4 years
    @Saravanan you confused AND & with OR |, it should be RegexOptions.IgnoreCase | RegexOptions.IgnorePatternWhitespace | RegexOptions.CultureInvariant; instead.
  • Dariusz Woźniak
    Dariusz Woźniak over 3 years
    Available in .NET 5.0 as well.
  • Josh
    Josh over 3 years
    Because this simplifies both reading and writing the code. It's essentially mimicking what later versions of .Net added directly to the class. There's a lot to be said for simple convenience methods that make your life and the life of others easier, even if they do add a little bit of abstraction.
  • Surender Singh Malik
    Surender Singh Malik over 3 years
    yes it is available in .net Standard 2.1 and .Net Core 5.0 docs.microsoft.com/en-us/dotnet/api/… Got fixed as part of - github.com/dotnet/runtime/issues/22198
  • Ristogod
    Ristogod almost 3 years
    and it's slower than most other options
  • sofsntp
    sofsntp almost 3 years
    .NET 5.0 is included in ".NET Core 2.0+"
  • Steve
    Steve almost 3 years
    @Bedir for non-turkish speakers is any elaboration possible? google translate told me they translated to "STYLISH" and "stylish" respectively both showing "Translation verified by Google Translate contributors"
  • Mike Christiansen
    Mike Christiansen over 2 years
    @Nyerguds - I often make extension methods specifically for null values. For me, having an extension method work on a null value is a /feature/. As a nice, simple example, I prefer to use an extension method instead of calling string.IsNullOrEmpty(). The benefit of this extension method is that I can call it on a null string. Without that benefit, I would have to use the null-conditional operator, followed by a null-coalescing operator (to coalesce the null value of the bool? to false)
  • Jeff
    Jeff over 2 years
    Why do we prefer ToUpperInvariant over ToLowerInvariant?
  • David Pierson
    David Pierson over 2 years
    Due to the various deficiencies listed, this is not an alternative solution. I invoke Jamie Zawinski's quote on Regex at this point.