Case insensitive 'Contains(string)'
Solution 1
To test if the string paragraph contains the string word (thanks @QuarterMeister):
culture.CompareInfo.IndexOf(paragraph, word, CompareOptions.IgnoreCase) >= 0
Where culture is the instance of CultureInfo describing the language that the text is written in.
This solution is transparent about the definition of case-insensitivity, which is language dependent. For example, the English language uses the characters I and i for the upper and lower case versions of the ninth letter, whereas the Turkish language uses these characters for the eleventh and twelfth letters of its 29 letter-long alphabet. The Turkish upper case version of 'i' is the unfamiliar character 'İ'.
Thus the strings tin and TIN are the same word in English, but different words in Turkish. As I understand, one means 'spirit' and the other is an onomatopoeic word. (Turks, please correct me if I'm wrong, or suggest a better example.)
To summarise, you can only answer the question 'are these two strings the same but in different cases' if you know what language the text is in. If you don't know, you'll have to take a punt. Given English's hegemony in software, you should probably resort to CultureInfo.InvariantCulture, because it will be wrong in familiar ways.
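The culture-aware check above can be wrapped in a small helper. This is only a minimal sketch: the class name CultureSearch and the method name CultureContains are made up for illustration and are not part of the original answer. The outcome for Turkish text can depend on the globalization data the runtime uses, so no Turkish result is claimed here.

```csharp
using System;
using System.Globalization;

public static class CultureSearch
{
    // Hypothetical helper (name is an assumption): true if 'word' occurs in
    // 'paragraph', ignoring case according to the rules of 'culture'.
    public static bool CultureContains(string paragraph, string word, CultureInfo culture)
    {
        if (paragraph == null) throw new ArgumentNullException(nameof(paragraph));
        if (word == null) throw new ArgumentNullException(nameof(word));
        if (culture == null) throw new ArgumentNullException(nameof(culture));

        // CompareInfo.IndexOf returns -1 when there is no match.
        return culture.CompareInfo.IndexOf(paragraph, word, CompareOptions.IgnoreCase) >= 0;
    }
}
```

Calling CultureSearch.CultureContains("A TIN can", "tin", CultureInfo.InvariantCulture) reports a match; passing a Turkish CultureInfo instead is where the answer may differ.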
Solution 2
You could use the String.IndexOf method and pass StringComparison.OrdinalIgnoreCase as the type of search to use:
string title = "STRING";
bool contains = title.IndexOf("string", StringComparison.OrdinalIgnoreCase) >= 0;
Even better is defining a new extension method for string:
public static class StringExtensions
{
    public static bool Contains(this string source, string toCheck, StringComparison comp)
    {
        return source?.IndexOf(toCheck, comp) >= 0;
    }
}
Note that the null-propagation operator ?. is available since C# 6.0 (VS 2015); for older versions use:
if (source == null) return false;
return source.IndexOf(toCheck, comp) >= 0;
USAGE:
string title = "STRING";
bool contains = title.Contains("string", StringComparison.OrdinalIgnoreCase);
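The comments below raise two edge cases: a null argument and an empty search string. The following is a sketch of one way to handle both, not the answer's own code. The method name ContainsIgnoreCase is borrowed from a suggestion in the comments, and returning false for a null toCheck is an assumption made here (IndexOf itself would throw ArgumentNullException).

```csharp
using System;

public static class StringContainsExtensions
{
    // Separate name so it does not collide with the built-in string.Contains
    // overloads available on newer runtimes.
    public static bool ContainsIgnoreCase(this string source, string toCheck)
    {
        // Extension methods can be invoked on a null reference, so guard here.
        // Treating a null toCheck as "not contained" is a design assumption.
        if (source == null || toCheck == null) return false;

        // IndexOf returns 0 for an empty toCheck, so "" counts as contained,
        // in line with the documented behaviour of string.Contains.
        return source.IndexOf(toCheck, StringComparison.OrdinalIgnoreCase) >= 0;
    }
}
```

With this sketch, "ASTRINGTOTEST".ContainsIgnoreCase("string") is true, "abc".ContainsIgnoreCase("") is true, and calling it on a null string returns false rather than throwing.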
Solution 3
You can use IndexOf() like this:
string title = "STRING";
if (title.IndexOf("string", 0, StringComparison.CurrentCultureIgnoreCase) != -1)
{
    // The string exists in the original
}
Since 0 (zero) can be a valid index, you check against -1.
The documentation describes the return value as: "The zero-based index position of value if that string is found, or -1 if it is not. If value is String.Empty, the return value is 0."
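To make the -1 sentinel concrete, here is a small sketch that returns the raw index instead of a bool. The class and method names are hypothetical, and the comparison is passed in as a parameter so the behaviour does not depend on the current thread culture.

```csharp
using System;

public static class IndexOfSentinelDemo
{
    // Hypothetical helper: returns the raw index so the caller can compare it with -1.
    public static int FindFrom(string text, string value, StringComparison comparison)
    {
        // The explicit start index 0 mirrors the overload used in the answer.
        return text.IndexOf(value, 0, comparison);
    }
}
```

With StringComparison.OrdinalIgnoreCase, FindFrom("STRING", "string", ...) returns 0 because the match starts at the first character, so a test written as "> 0" would wrongly report a miss. A value that is absent, such as "xyz", yields -1.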
Solution 4
Alternative solution using Regex:
bool contains = Regex.IsMatch("StRiNG to search", Regex.Escape("string"), RegexOptions.IgnoreCase);
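The Regex.Escape call matters more than it may look, as several comments below point out. The sketch here contrasts the escaped and unescaped forms; the class and method names are invented for the comparison and are not from the original answer.

```csharp
using System.Text.RegularExpressions;

public static class RegexContains
{
    // Treats 'value' as literal text by escaping regex metacharacters first.
    public static bool ContainsLiteral(string input, string value)
    {
        return Regex.IsMatch(input, Regex.Escape(value), RegexOptions.IgnoreCase);
    }

    // For contrast only: treats 'value' as a pattern, which is rarely what a
    // "contains" check wants.
    public static bool ContainsAsPattern(string input, string value)
    {
        return Regex.IsMatch(input, value, RegexOptions.IgnoreCase);
    }
}
```

ContainsLiteral("no dot here", ".") is false, whereas ContainsAsPattern("no dot here", ".") is true because an unescaped "." matches any character. An unescaped search string such as "(invalid" is not a valid pattern and makes Regex.IsMatch throw.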
Solution 5
.NET Core 2.0+ (including .NET 5.0+)
.NET Core has had a pair of methods to deal with this since version 2.0:
- String.Contains(Char, StringComparison)
- String.Contains(String, StringComparison)
Example:
"Test".Contains("test", System.StringComparison.CurrentCultureIgnoreCase);
It is now officially part of the .NET Standard 2.1, and therefore part of all the implementations of the Base Class Library that implement this version of the standard (or a higher one).
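A short sketch of both built-in overloads follows. The wrapper class and method names are hypothetical, and the code assumes a runtime that ships these overloads; it will not compile against the older .NET Framework.

```csharp
using System;

public static class BuiltInContainsDemo
{
    // String overload: substring search with an explicit comparison.
    public static bool HasSubstring(string text, string value)
    {
        return text.Contains(value, StringComparison.OrdinalIgnoreCase);
    }

    // Char overload: single-character search with an explicit comparison.
    public static bool HasChar(string text, char value)
    {
        return text.Contains(value, StringComparison.OrdinalIgnoreCase);
    }
}
```

HasSubstring("ASTRINGTOTEST", "string") and HasChar("Test", 'E') both return true. OrdinalIgnoreCase is used in the sketch only to keep the result independent of the current culture; the example above uses CurrentCultureIgnoreCase.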
Boris Callens
Updated on April 11, 2022
Comments
-
Boris Callens about 2 years
Is there a way to make the following return true?
string title = "ASTRINGTOTEST";
title.Contains("string");
There doesn't seem to be an overload that allows me to set the case sensitivity. Currently I UPPERCASE them both, but that's just silly (by which I am referring to the i18n issues that come with up- and down casing).
UPDATE
This question is ancient and since then I have realized I asked for a simple answer for a really vast and difficult topic if you care to investigate it fully.
For most cases, in mono-lingual, English code bases this answer will suffice. I suspect that because most people coming here fall into this category, this is the most popular answer.
This answer however brings up the inherent problem that we can't compare text case insensitive until we know both texts are the same culture and we know what that culture is. This is maybe a less popular answer, but I think it is more correct and that's why I marked it as such.
-
گلی almost 3 years
try this one: Yourculture.CompareInfo.IndexOf(paragraph, word, CompareOptions.IgnoreCase) >= 0
-
-
Matt Hamilton over 15 years
Interestingly, I've seen ToUpper() recommended over the use of ToLower() in this sort of scenario, because apparently ToLower() can "lose fidelity" in certain cultures - that is, two different upper-case characters translate to the same lower-case character.
-
Jon Skeet over 15 years
Search for "Turkey test" :)
-
Marc Stober over 15 years
In some French locales, uppercase letters don't have the diacritics, so ToUpper() may not be any better than ToLower(). I'd say use the proper tools if they're available - case-insensitive compare.
-
Peter Gfader almost 15 years
Don't use ToUpper or ToLower, and do what Jon Skeet said
-
satyavrat over 13 years
Where would be the best place to put something like this within an app structure?
-
JaredPar over 13 years
@Andi, I usually have a couple of code files with general purpose extension methods where this would go.
-
Ed S. over 13 years
Just saw this again after two years and a new downvote... anyway, I agree that there are better ways to compare strings. However, not all programs will be localized (most won't) and many are internal or throwaway apps. Since I can hardly expect credit for advice best left for throwaway apps... I'm moving on :D
-
amurra over 13 years
If toCheck is the empty string it needs to return true per the Contains documentation: "true if the value parameter occurs within this string, or if value is the empty string (""); otherwise, false."
-
Mark Rolich almost 13 years
The questioner is looking for Contains not Compare.
-
Saravanan over 12 years
Good Idea, also we have a lot of bitwise combinations in RegexOptions like RegexOptions.IgnoreCase & RegexOptions.IgnorePatternWhitespace & RegexOptions.CultureInvariant; for anyone if helps.
-
user3018301 over 12 years
Based on amurra's comment above, doesn't the suggested code need to be corrected? And shouldn't this be added to the accepted answer, so that the best response is first?
-
wonea over 12 years
Must say I prefer this method although using IsMatch for neatness.
-
cHao over 12 years
What's worse, since the search string is interpreted as a regex, a number of punctuation chars will cause incorrect results (or trigger an exception due to an invalid expression). Try searching for "." in "This is a sample string that doesn't contain the search string". Or try searching for "(invalid", for that matter.
-
Dan Mangiarelli over 12 years
@cHao: In that case, Regex.Escape could help. Regex still seems unnecessary when IndexOf / extension Contains is simple (and arguably more clear).
-
Jed over 12 years
Note that I was not implying that this Regex solution was the best way to go. I was simply adding to the list of answers to the original posted question "Is there a way to make the following return true?".
-
wonea over 12 years
@cHao: thanks for highlighting that, illuminating. Still prefer this way for the inherent power of Regexes.
-
VoodooChild over 12 years
@JaredPar: I am curious to know if .Net 4 or .Net 4.5 has this built in, would you know?
-
JaredPar over 12 years
@VoodooChild it doesn't appear so. Judging by the 4.0 API listing here msdn.microsoft.com/en-us/library/system.string_methods.aspx
-
Yogesh Pareek over 12 years
Doesn't seem to work within Entity Framework (4.x) queries. Probably because it's within the LINQ to SQL portion. I get the following error: LINQ to SQL does not recognize the method 'Boolean Contains(System.String, System.String, System.StringComparison)' method, and this method cannot be translated into a store expression. A sub-query performed after the results are returned works. I don't think it has anything to do with the extension as much as LINQ to SQL doesn't know how to translate the code to a SQL query.
-
Michael Bahig over 12 years
there is a reference site for such useful extension, to share the benefit. this is an entry very similar to this one there extensionmethod.net/Details.aspx?ID=473
-
vulcan raven over 11 years
@DuckMaestro, the accepted answer is implementing Contains with IndexOf. So this approach is equally helpful! The C# code example on this page is using string.Compare(). SharePoint team's choice that is!
-
XåpplI'-I0llwlg'I - over 11 years
This will match against a pattern, though. In your example, if fileNamestr has any special regex characters (e.g. *, +, ., etc.) then you will be in for quite a surprise. The only way to make this solution work like a proper Contains function is to escape fileNamestr by doing Regex.Escape(fileNamestr).
-
Richard Pursehouse over 11 years
Great string extension method! I've edited mine to check the source string is not null to prevent any object reference errors from occurring when performing .IndexOf().
-
Timothy Walters over 11 years
@JookyDFW LINQ to SQL (and LINQ to Entities against a SQL database) will use string comparisons using the default collation in your DB, which in most cases is case-insensitive, so "STRING" and "string" match when hitting the DB, but don't match on the returned results without using this extension.
-
Colonel Panic about 11 years
This gives the same answer as paragraph.ToLower(culture).Contains(word.ToLower(culture)) with CultureInfo.InvariantCulture and it doesn't solve any localisation issues. Why over complicate things? stackoverflow.com/a/15464440/284795
-
Boris Callens about 11 years
I see your point and you are probably right. This question is ancient and I need to read over it again, but I think I'll change the accepted answer.
-
Quartermeister about 11 years
Why not culture.CompareInfo.IndexOf(paragraph, word, CompareOptions.IgnoreCase) >= 0? That uses the right culture and is case-insensitive, it doesn't allocate temporary lowercase strings, and it avoids the question of whether converting to lowercase and comparing is always the same as a case-insensitive comparison.
-
Colonel Panic about 11 years
@Quartermeister that'll work. I tried to find out how CompareInfo.IndexOf defines case-insensitive comparison, but the method simply wraps InternalFindNLSStringEx which is undocumented.
-
JaredPar about 11 years
@ColonelPanic the ToLower version includes 2 allocations which are unnecessary in a comparison / search operation. Why needlessly allocate in a scenario that doesn't require it?
-
JaredPar about 11 years
This solution also needlessly pollutes the heap by allocating memory for what should be a searching function
-
Quartermeister about 11 years
Comparing with ToLower() will give different results from a case-insensitive IndexOf when two different letters have the same lowercase letter. For example, calling ToLower() on either U+0398 "Greek Capital Letter Theta" or U+03F4 "Greek Capital Letter Theta Symbol" results in U+03B8, "Greek Small Letter Theta", but the capital letters are considered different. Both solutions consider lowercase letters with the same capital letter different, such as U+0073 "Latin Small Letter S" and U+017F "Latin Small Letter Long S", so the IndexOf solution seems more consistent.
-
Quartermeister about 11 years
There is MSDN documentation for FindNLSStringEx, and that probably applies to InternalFindNLSStringEx. The flag is NORM_IGNORECASE, which "ignores any tertiary distinction, whether or not it is actually linguistic case", but I don't know enough about NLS to know what a "tertiary distinction" is.
-
kutschkem about 11 years
To make this answer a little more complete, considering you want to do web mining: If you don't know the language of a page a priori, you can quite easily figure it out with a simple unigram-based language model. The only problem is getting the data for enough different languages - but probably there are libraries out there that can predict a page's language - I would guess this is a common enough problem.
-
Simon Mourier about 11 years
@Quartermeister - tertiary distinction is case distinction. Could be linguistic (turkish i -> İ) or not (ascii i -> I). A definition can be found on oracle site: docs.oracle.com/cd/B28359_01/server.111/b28298/… more on linguistic casing here: blogs.msdn.com/b/michkap/archive/2004/12/11/279942.aspx
-
Simon Mourier about 11 years
@Quartermeister - and BTW, I believe .NET 2 and .NET 4 behave differently on this as .NET 4 always uses NORM_LINGUISTIC_CASING while .NET 2 did not (this flag appeared with Windows Vista).
-
Michael Freidgeim about 11 years
Should you move method with CompareOptions.IgnoreCase to be first, because .ToLower is obviously inefficient?
-
Jonathan Stark almost 11 years
Now this will return true if source is an empty string or null no matter what toCheck is. That cannot be correct. Also IndexOf already returns true if toCheck is an empty string and source is not null. What is needed here is a check for null. I suggest if (source == null || value == null) return false;
-
Casey over 10 years
Your answer is exactly the same as guptat59's but, as was pointed out on his answer, this will match a regular expression, so if the string you're testing contains any special regex characters it will not yield the desired result.
-
Casey about 10 years
I would recommend, in the same spirit, adding CompareOptions.IgnoreKanaType and CompareOptions.IgnoreWidth.
-
hikalkan almost 10 years
This is not culture-specific and may fail for some cases. culture.CompareInfo.IndexOf(paragraph, word, CompareOptions.IgnoreCase) should be used.
-
seebiscuit over 9 years
What about LINQ IEnumerable.Contains which lets you specify a StringComparison option like StringComparison.CurrentCultureIgnoreCase? See MSDN msdn.microsoft.com/en-us/library/bb339118(v=vs.95).aspx
-
JaredPar over 9 years
@Seabiscuit that won't work because string is an IEnumerable<char> hence you can't use it to find substrings
-
ygoe almost 9 years
Even if your app isn't localised, it may still run in an affected region, with the local thread culture in effect, and fail.
-
Ed S. almost 9 years
@LonelyPixel: I suppose my point was that this code shouldn't appear in anything but a throw-away type of application. In that case, I don't think you really care.
-
Chen over 8 years
Why didn't you write "ddddfg".IndexOf("Df", StringComparison.OrdinalIgnoreCase) ?
-
Boris Callens over 8 years
@fiat This used to be the accepted answer, but is actually not really the correct answer. As long as you stay within the realm of standard Latin characters this looks like a trivial difference, but for large parts of the world it really isn't. The problem is I asked for a simple answer for what I now know is a really difficult topic and the answer really depends on the scenario. I'll update the original question to reflect that.
-
Liam over 8 years
This will be dramatically less efficient than using IndexOf. The use of a Regex will add considerably more processing time, memory, etc.
-
What Would Be Cool over 8 years
Just as a style preference, I would just add an explicit method, rather than overload Contains: public static bool ContainsIgnoreCase(this string source, string toCheck) { return source.IndexOf(toCheck, StringComparison.OrdinalIgnoreCase) >= 0; }
-
Chris Marisic over 8 years
@CodeBlend most likely they tested all their internal stuff that is aware of their changes, and no one tested (or no one cared) that it broke all external links.
-
Jeppe Stig Nielsen over 8 years
A word of warning: The default for string.IndexOf(string) is to use the current culture, while the default for string.Contains(string) is to use the ordinal comparer. As we know, the former can be changed by picking a longer overload, while the latter cannot be changed. A consequence of this inconsistency is the following code sample:
Thread.CurrentThread.CurrentCulture = CultureInfo.InvariantCulture;
string self = "Waldstrasse";
string value = "straße";
Console.WriteLine(self.Contains(value)); /* False */
Console.WriteLine(self.IndexOf(value) >= 0); /* True */
-
Liam over 7 years
Why avoid string.ToLower() when doing case-insensitive string comparisons? Tl;Dr It's costly because a new string is "manufactured".
-
Liam over 7 years
This is a straight up copy of this answer and suffers from the same issues as noted in that answer
-
Lucas over 7 years
The source can't be null
-
James about 7 years
I just tried this. I'm searching for 154 string patterns in 40,000 files, ignoring lower case. It is extremely slow and took hours to run. Using Regex or ToLower might be more error prone for non-English searches, but they're both way faster.
-
BenKoshy about 7 years
what if you know you're always gonna get an english string. which one to use?
-
Fabian Bigler about 7 years
@BKSpurgeon I'd use OrdinalIgnoreCase, if case does not matter
-
ANeves almost 7 years
Because this doesn't work in simple scenarios (".", "no dot here"), this is not an "alternate solution".
-
bernieslearnings over 6 years
Downvote for just being incorrect. What if title = StRiNg? StRiNg != string and StRiNg != STRING
-
Jar over 6 years
Agreed. Study regular expressions
-
O Thạnh Ldt about 6 years
I was wrong. Edit answer as follows, too simple: title.ToLower().Contains("string") // of course "string" is lowercase
-
Kyle Delaney about 6 years
if (string.IsNullOrEmpty(source)) return string.IsNullOrEmpty(toCheck);
-
Kyle Delaney about 6 years
How could source be null? This is an extension method.
-
Iain about 6 years
@KyleDelaney extension method this can be null. I wish C# had a builtin null check there, but it doesn't.
-
JackAce almost 6 years
Is searching for "Turkey test" the same as searching for "TURKEY TEST"?
-
Ed S. almost 6 years
@JackAce: depends on the application.
-
Tore Aurstad almost 6 years
I did this same extension method, however I null checked instead on the string toCheck, if the coder passes in a null value, the IndexOf method throws an ArgumentNullException anyways. So the extension method could check both source and toCheck to be fault tolerant perhaps.
-
bytedev over 5 years
So an empty string contains "Foo"... how is that true?
-
AFract over 5 years
tip: Instead of using current culture for cultureInfo value, you can also use CultureInfo.InvariantCulture
-
Boris Callens over 5 years
Assuming your paragraph and word will always be in en-US
-
AndrewWhalan almost 5 years
To avoid issues with forcing the culture to en-US, use return CultureInfo.CurrentCulture.CompareInfo.IndexOf(paragraph, word, CompareOptions.IgnoreCase) >= 0; instead.
-
Alex Gordon almost 5 years
i had to downvote this as well as 6 other people, this is simply wrong. should be return false; not return true;
-
Alex Gordon almost 5 years
why are you allowing ANOTHER layer of abstraction over StringComparison?
-
Bedir over 4 years
SIK and sik. Big difference.
-
phuclv over 4 years
besides, parsing and matching a regex is much more resource-intensive than a simple case-insensitive comparison
-
Nyerguds over 4 years
@Iain If you call myString.SomeMethod() and myString is null you well and truly deserve your null pointer exception. It's not the job of an extension method to check that.
-
Paweł Bulwan about 4 years
Now also available in .NET Standard 2.1
-
Iain about 4 years
@Nyerguds no, calling an extension method on a null pointer works. It surprised me when I first saw it, but :shrug:
-
Nyerguds about 4 years
@Iain My point stands; even if that works, it's not the job of the method to check that, because extensions are supposed to act like normal functions executed on an object. That said, it is certainly useful to know that little quirk.
-
Nekuskus almost 4 years
@Saravanan you confused AND & with OR |, it should be RegexOptions.IgnoreCase | RegexOptions.IgnorePatternWhitespace | RegexOptions.CultureInvariant; instead.
-
Dariusz Woźniak over 3 years
Available in .NET 5.0 as well.
-
Josh over 3 years
Because this simplifies both reading and writing the code. It's essentially mimicking what later versions of .Net added directly to the class. There's a lot to be said for simple convenience methods that make your life and the life of others easier, even if they do add a little bit of abstraction.
-
Surender Singh Malik over 3 years
yes it is available in .net Standard 2.1 and .Net Core 5.0 docs.microsoft.com/en-us/dotnet/api/… Got fixed as part of - github.com/dotnet/runtime/issues/22198
-
Ristogod almost 3 years
and it's slower than most other options
-
sofsntp almost 3 years
.NET 5.0 is included in ".NET Core 2.0+"
-
Steve almost 3 years
@Bedir for non-turkish speakers is any elaboration possible? google translate told me they translated to "STYLISH" and "stylish" respectively both showing "Translation verified by Google Translate contributors"
-
Bedir almost 3 years
-
Mike Christiansen over 2 years
@Nyerguds - I often make extension methods specifically for null values. For me, having an extension method work on a null value is a /feature/. As a nice, simple example, I prefer to use an extension method instead of calling string.IsNullOrEmpty(). The benefit of this extension method is that I can call it on a null string. Without that benefit, I would have to use the null-conditional operator, followed by a null-coalescing operator (to coalesce the null value of the bool? to false)
-
Jeff over 2 years
Why do we prefer ToUpperInvariant over ToLowerInvariant?
-
Jeff over 2 years
nevermind found out why docs.microsoft.com/en-us/dotnet/fundamentals/code-analysis/…
-
David Pierson over 2 years
Due to the various deficiencies listed, this is not an alternative solution. I invoke Jamie Zawinski's quote on Regex at this point.