Is there an alternative to string.Replace that is case-insensitive?

Solution 1

From MSDN
$0 - "Substitutes the last substring matched by group number number (decimal)."

In .NET regular expressions, group 0 is always the entire match, so $0 in a replacement string re-inserts the matched text. For a literal $ you need to escape it as $$:

string value = Regex.Replace("%PolicyAmount%", "%PolicyAmount%", @"$$0", RegexOptions.IgnoreCase);
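
As a minimal sketch of the same idea applied to arbitrary input, the helper below treats both the token and the replacement as literal text. The names `LiteralRegex` and `ReplaceIgnoreCase` are made up for illustration, and the sketch assumes that `$` is the only character with special meaning in a replacement pattern.

```csharp
using System;
using System.Text.RegularExpressions;

public static class LiteralRegex
{
    // Hypothetical helper: case-insensitive replace in which neither
    // the token nor the replacement is interpreted as regex syntax.
    public static string ReplaceIgnoreCase(string input, string token, string replacement)
    {
        return Regex.Replace(
            input,
            Regex.Escape(token),             // neutralise regex metacharacters in the token
            replacement.Replace("$", "$$"),  // "$$" is a literal "$" in a replacement pattern
            RegexOptions.IgnoreCase);
    }
}

public static class EscapeDemo
{
    public static void Main()
    {
        string text = "Amount due: %policyamount%";
        // prints "Amount due: $0"
        Console.WriteLine(LiteralRegex.ReplaceIgnoreCase(text, "%PolicyAmount%", "$0"));
    }
}
```

Doubling every `$` is blunter than escaping only the `$0`-style markers, but it means nothing in the replacement can be read as a group reference.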

Solution 2

Seems like string.Replace should have an overload that takes a StringComparison argument. Since it doesn't, you could try something like this:

public static string ReplaceString(string str, string oldValue, string newValue, StringComparison comparison)
{
    StringBuilder sb = new StringBuilder();

    int previousIndex = 0;
    int index = str.IndexOf(oldValue, comparison);
    while (index != -1)
    {
        sb.Append(str.Substring(previousIndex, index - previousIndex));
        sb.Append(newValue);
        index += oldValue.Length;

        previousIndex = index;
        index = str.IndexOf(oldValue, index, comparison);
    }
    sb.Append(str.Substring(previousIndex));

    return sb.ToString();
}
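
A hedged variant of the loop above, folding in the argument checks and the early exit that commenters suggest further down. The names `GuardedReplace` and `ReplaceGuarded` are hypothetical. The sketch assumes an ordinal comparison (`Ordinal` or `OrdinalIgnoreCase`); with culture-sensitive comparisons the matched text can have a different length than `oldValue`, which is the failure Michael Liu reports in the comments.

```csharp
using System;
using System.Text;

public static class GuardedReplace
{
    // Sketch only: intended for StringComparison.Ordinal / OrdinalIgnoreCase.
    public static string ReplaceGuarded(string str, string oldValue, string newValue, StringComparison comparison)
    {
        if (str == null) throw new ArgumentNullException("str");
        if (oldValue == null) throw new ArgumentNullException("oldValue");
        if (oldValue.Length == 0) throw new ArgumentException("String cannot be of zero length.", "oldValue");

        // Early exit: hand back the original instance when nothing matches.
        int index = str.IndexOf(oldValue, comparison);
        if (index == -1) return str;

        StringBuilder sb = new StringBuilder(str.Length);
        int previousIndex = 0;
        while (index != -1)
        {
            sb.Append(str, previousIndex, index - previousIndex); // text before the match
            sb.Append(newValue);                                   // a null newValue appends nothing
            previousIndex = index + oldValue.Length;
            index = str.IndexOf(oldValue, previousIndex, comparison);
        }
        sb.Append(str, previousIndex, str.Length - previousIndex); // tail after the last match
        return sb.ToString();
    }
}

public static class GuardedDemo
{
    public static void Main()
    {
        // prints "Hello Ann!"
        Console.WriteLine(GuardedReplace.ReplaceGuarded(
            "Hello %FIRSTNAME%!", "%FirstName%", "Ann", StringComparison.OrdinalIgnoreCase));
    }
}
```

Rejecting an empty `oldValue` up front is what prevents the endless loop mentioned in the comments, and sizing the `StringBuilder` to the input length avoids most of the regrowth when the replacement is about as long as the token.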

Solution 3

Kind of a confusing group of answers, in part because the title of the question is actually much broader than the specific question being asked. After reading through, I'm not sure any answer is a few edits away from assimilating all the good stuff here, so I figured I'd try to sum it up.

Here's an extension method that I think avoids the pitfalls mentioned here and provides the most broadly applicable solution.

public static string ReplaceCaseInsensitiveFind(this string str, string findMe,
    string newValue)
{
    return Regex.Replace(str,
        Regex.Escape(findMe),
        Regex.Replace(newValue, "\\$[0-9]+", @"$$$0"),
        RegexOptions.IgnoreCase);
}

So...

  • This is an extension method @MarkRobinson
  • This doesn't try to skip Regex @Helge (you really have to do byte-by-byte if you want to string sniff like this outside of Regex)
  • Passes @MichaelLiu 's excellent test case, "œ".ReplaceCaseInsensitiveFind("oe", ""), though he may have had a slightly different behavior in mind.

Unfortunately, @HA 's comment that you have to Escape all three isn't correct. The initial value and newValue don't need to be.

Note: You do, however, have to escape $s in the new value that you're inserting if they're part of what would appear to be a "captured value" marker. Thus the three dollar signs in the Regex.Replace inside the Regex.Replace [sic]. Without that, something like this breaks...

"This is HIS fork, hIs spoon, hissssssss knife.".ReplaceCaseInsensitiveFind("his", @"he$0r")

Here's the error:

An unhandled exception of type 'System.ArgumentException' occurred in System.dll

Additional information: parsing "The\hisr\ is\ he\HISr\ fork,\ he\hIsr\ spoon,\ he\hisrsssssss\ knife\." - Unrecognized escape sequence \h.

Tell you what, I know folks that are comfortable with Regex feel like their use avoids errors, but I'm often still partial to byte sniffing strings (but only after having read Spolsky on encodings) to be absolutely sure you're getting what you intended for important use cases. Reminds me of Crockford on "insecure regular expressions" a little. Too often we write regexps that allow what we want (if we're lucky), but unintentionally allow more in (eg, Is $10 really a valid "capture value" string in my newValue regexp, above?) because we weren't thoughtful enough. Both methods have value, and both encourage different types of unintentional errors. It's often easy to underestimate complexity.

That weird $ escaping (and that Regex.Escape didn't escape captured value patterns like $0 as I would have expected in replacement values) drove me mad for a while. Programming Is Hard (c) 1842
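
One way to sidestep the $ escaping altogether, along the lines Paolo Tedesco suggests in the comments, is to pass a MatchEvaluator instead of a replacement string. This is a sketch under that assumption rather than a drop-in replacement for the method above; the names `EvaluatorReplace` and `ReplaceIgnoreCaseLiteral` are hypothetical.

```csharp
using System;
using System.Text.RegularExpressions;

public static class EvaluatorReplace
{
    // The evaluator's return value is inserted as-is, so markers such as
    // "$0" or "$10" in newValue are not expanded.
    public static string ReplaceIgnoreCaseLiteral(string str, string findMe, string newValue)
    {
        return Regex.Replace(
            str,
            Regex.Escape(findMe),   // the search text is still escaped as a pattern
            match => newValue,      // lambda converts to MatchEvaluator
            RegexOptions.IgnoreCase);
    }
}

public static class EvaluatorDemo
{
    public static void Main()
    {
        // prints "he$0r fork"
        Console.WriteLine(EvaluatorReplace.ReplaceIgnoreCaseLiteral("HIS fork", "his", "he$0r"));
    }
}
```

The trade-off is a delegate call per match in exchange for not having to reason about which `$` sequences count as substitutions.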

Solution 4

Seems the easiest method is simply to use the Replace method that ships with .Net and has been around since .Net 1.0:

// res already holds the input string; CompareMethod.Text makes the match case-insensitive.
res = Microsoft.VisualBasic.Strings.Replace(res,
                                   "%PolicyAmount%",
                                   "$0",
                                   Compare: Microsoft.VisualBasic.CompareMethod.Text);

To use this method, add a reference to the Microsoft.VisualBasic assembly. This assembly is a standard part of the .Net runtime; it is not an extra download and is not marked as obsolete.

Solution 5

Here's an extension method. Not sure where I found it.

public static class StringExtensions
{
    public static string Replace(this string originalString, string oldValue, string newValue, StringComparison comparisonType)
    {
        int startIndex = 0;
        while (true)
        {
            startIndex = originalString.IndexOf(oldValue, startIndex, comparisonType);
            if (startIndex == -1)
                break;

            originalString = originalString.Substring(0, startIndex) + newValue + originalString.Substring(startIndex + oldValue.Length);

            startIndex += newValue.Length;
        }

        return originalString;
    }

}

Author: Aheho

I have experience with asp.net, delphi, MS-Sql Server, and Visual Basic v6.

Updated on August 09, 2020

Comments

  • Aheho
    Aheho almost 4 years

    I need to search a string and replace all occurrences of %FirstName% and %PolicyAmount% with a value pulled from a database. The problem is the capitalization of FirstName varies. That prevents me from using the String.Replace() method. I've seen web pages on the subject that suggest

    Regex.Replace(strInput, strToken, strReplaceWith, RegexOptions.IgnoreCase);
    

    However for some reason when I try and replace %PolicyAmount% with $0, the replacement never takes place. I assume that it has something to do with the dollar sign being a reserved character in regex.

    Is there another method I can use that doesn't involve sanitizing the input to deal with regex special characters?

    • cfeduke
      cfeduke over 15 years
      If "$0" is the variable going in that doesn't impact the regex at all.
    • ruffin
      ruffin over 2 years
      As Markus points out, it appears "modern" versions of .NET now have this baked in with good ole StringComparison.OrdinalIgnoreCase as a third parameter.
  • cfeduke
    cfeduke over 15 years
    By reverse, I mean process the found locations in reverse from furthest to shortest, not traverse the string from the database in reverse.
  • Aheho
    Aheho over 15 years
    This doesn't work. The $ is not in the token. It's in the strReplace With string.
  • Joel Coehoorn
    Joel Coehoorn over 15 years
    And you can't adapt it for that?
  • Aheho
    Aheho over 15 years
    This site is supposed to be a repository for correct answers. Not answers that are almost correct.
  • Muhammad Hafizh
    Muhammad Hafizh about 15 years
    Extension methods only work in 3+ right? +1 All the same, since the OP wasn't specific, but you may want to mention it
  • dhara tcrails
    dhara tcrails about 14 years
    Also, this will be faster than the regex.
  • AMissico
    AMissico almost 14 years
    Nice. I would change ReplaceString to Replace.
  • David Guerrero
    David Guerrero over 13 years
    in this particular case this is fine, but in cases where the strings are input from outside, one cannot be sure that they do not contain characters which mean something special in regular expressions
  • Mark Robinson
    Mark Robinson over 13 years
    Agree with the comments above. This can be made into an extension method with the same method name. Just pop it in a static class with the method signature: public static string Replace(this String str, string oldValue, string newValue, StringComparison comparison)
  • Helge Klein
    Helge Klein about 13 years
    Speed is not everything. Use the regex instead of doing it yourself, introducing additional complexity and potentially also bugs. Additionally the regex solution is much easier to read and maintain.
  • Helge Klein
    Helge Klein about 13 years
    You should escape special characters like this: string value = Regex.Replace("%PolicyAmount%", Regex.Escape("%PolicyAmount%"), Regex.Escape("$0"), RegexOptions.IgnoreCase);
  • Paolo Tedesco
    Paolo Tedesco about 13 years
    Actually, regex-escaping the second string will have no effect apart from adding an extra \ before the replacement. To ignore special characters in the replacement string, you'd better write a MatchEvaluator that returns the string itself.
  • Jim
    Jim about 13 years
    @Helge, in general, that may be fine, but I have to take arbitrary strings from the user and can not risk the input being meaningful to regex. Of course, I guess I could write a loop and put a backslash in front of each and every character... At that point, I might as well do the above (IMHO).
  • Holger Adam
    Holger Adam over 11 years
    Please watch out when using Regex.Escape in Regex.Replace. You'll have to escape all of the three strings passed and call Regex.Unescape on the result!
  • James Manning
    James Manning over 11 years
    @Jim - I agree on using this solution instead, but just in case you ever need it, you can use Regex.Escape to escape regex-important characters for you.
  • Jim
    Jim over 11 years
    @JamesManning - Hmm, interesting--didn't know about Escape(). Thanks.
  • Ishmael
    Ishmael about 11 years
    While unit testing this I ran into the case where it would never return when oldValue == newValue == "".
  • goodeye
    goodeye about 11 years
    For the case of oldValue = "", String.Replace doesn't allow it. I added exception checks to match String.Replace exceptions: if (oldValue == null) { throw new ArgumentNullException("oldValue"); } if (oldValue == "") { throw new ArgumentException("String cannot be of zero length.", "oldValue"); }
  • CleverPatrick
    CleverPatrick almost 11 years
    It works. You need to add a reference to the Microsoft.VisualBasic assembly.
  • Walden Leverich
    Walden Leverich almost 11 years
    Great work here. I turned it into an extension method, but more importantly, I added a fast-out at the top in the case where str doesn't contain oldValue. Just move the int index = str.IndexOf(oldValue, comparison); to the first line of the method and return str if index == -1
  • Kiquenet
    Kiquenet over 10 years
    Which is better way ? what's about stackoverflow.com/a/244933/206730 ? better performance?
  • Michael Liu
    Michael Liu over 10 years
    This is buggy; ReplaceString("œ", "oe", "", StringComparison.InvariantCulture) throws ArgumentOutOfRangeException.
  • Jaycee
    Jaycee over 10 years
    Just learn Regex, keep the code clean. This is a trivial example but still looks complicated. People obsessing about speed and then writing crummy code like this is unfortunate.
  • crokusek
    crokusek about 10 years
    @Jaycee, having to escape the replacement string by default does not look like clean code to me. Also I'm sure the actual Regex implementation itself looks way more complicated and probably had numerous bugs in its initial versions. I do hope a final bug free version is posted.
  • Jaycee
    Jaycee almost 10 years
    @crokusek I use regex extensively and there are no bugs I have noticed. You are much more likely to introduce a bug with custom code like this.
  • Aheho
    Aheho almost 10 years
    Could you explain why you're multiplying by MatchNo?
  • Brandon
    Brandon almost 10 years
    If there is a difference in length between the oldValue and newValue, the string will get longer or shorter as you replace values. match.Index refers to the original location within the string, so we need to adjust for that position's movement due to our replacement. Another approach would be to execute the Remove/Insert from right to left.
  • Aheho
    Aheho almost 10 years
    I get that. That's what the "offset" variable is for. What I don't understand is why you are multiplying by matchNo. My intuition tells me that the location of a match within a string would have no relation to the actual count of previous occurrences.
  • Aheho
    Aheho almost 10 years
    Never mind, I get it now. The offset needs to be scaled based on the # of occurrences. If you are losing 2 characters each time you need to do a replace, you need to account for that when computing the parameters to the remove method
  • do0g
    do0g over 9 years
    Using StringBuilder in this way will likely not improve performance the way you intend; it will be initialised with a 16 character buffer and your loop will potentially cause a number of memory allocations and copies. You should initialise your StringBuilder to a suitable capacity before you begin appending strings to it.
  • Vad
    Vad over 9 years
    You may need to handle empty/null string cases.
  • Jeremy Thompson
    Jeremy Thompson about 9 years
    Strange that this method had some problems when I used it (characters at the beginning of line went missing). The most popular answer here from C. Dragon 76 worked as expected.
  • Brain2000
    Brain2000 almost 9 years
    The problem with this is it returns a NEW string even if a replacement isn't made, where the string.replace( ) returns a pointer to the same string. Can get inefficient if you're doing something like a form letter merge.
  • Ahmed
    Ahmed over 8 years
    @MichaelLiu What do you think of if(oldValue.Length > str.Length) return str; ... Any weird stuff this could cause? I've written a few tests, all using OrdinalIgnoreCase, and that workaround didn't break any. I might be missing some cases, of course, so what do you think?
  • Ahmed
    Ahmed over 8 years
    @MichaelLiu Here are the tests for this Replace gist.github.com/Galilyou/00dcd0dab2d2a050c30c
  • Michael Liu
    Michael Liu over 8 years
    @Galilyou: The problem I pointed out isn't with the length check; the problem is with IndexOf and StringComparison.InvariantCulture.
  • Bronek
    Bronek over 8 years
    According to msdn: "Character escapes are recognized in regular expression patterns but not in replacement patterns." ( msdn.microsoft.com/en-us/library/4edbef7e.aspx )
  • RWC
    RWC over 8 years
    Multiple errors in this solution: 1. Check originalString, oldValue and newValue for null. 2. Do not give originalString back (does not work, simple types are not passed by reference), but assign the value of originalString first to a new string and modify it and give it back.
  • ChrisG
    ChrisG almost 8 years
    +1 for not using regex when it's not necessary. Sure, you use a few more lines of code, but it's much more efficient than regex-based replace unless you need the $ functionality.
  • swe
    swe about 7 years
    @HolgerAdam Hm, I cannot follow your comment. "Regex.Replace("a[b]b", Regex.Escape("]B"), Regex.Escape("]C"), RegexOptions.IgnoreCase)" returns a[b]C, as expected. Can you explain why you think one needs to escape the input and unescape after?
  • Skorek
    Skorek almost 7 years
    It's best to use: string value = Regex.Replace("%PolicyAmount%", Regex.Escape("%PolicyAmount%"), "$0".Replace("$", "$$"), RegexOptions.IgnoreCase); as the replacement recognizes only dollar signs.
  • Der_Meister
    Der_Meister over 6 years
    Brain2000, you are wrong. All strings in .NET are immutable.
  • Julian
    Julian about 6 years
    note: ReplaceString("","","",StringComparison.CurrentCulture) will lead to an infinite loop!
  • Simon Hewitt
    Simon Hewitt almost 6 years
    Der_Meister, whilst what you say is correct, that doesn't make what Brain2000 said wrong.
  • Caltor
    Caltor about 5 years
    @WaldenLeverich So with your fast-out you avoid instantiating the StringBuilder but introduce an if statement? Seems a micro-optimisation (if that) to me.
  • Walden Leverich
    Walden Leverich about 5 years
    @Caltor It's not just the constructor, but the string copy after the while() and then the .ToString back to a string as well. These things add up. But more importantly than that, there's the developer benefit of quickly seeing what happens if there's no match. BTW, check MS's code for similar checks; they do fast-exit checks as well.
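
For readers on newer runtimes, the built-in overload that ruffin mentions in the comments can be sketched as follows. It assumes .NET Core 2.0 or later (including .NET 5+); the overload is not available on .NET Framework. The class and method names here are hypothetical.

```csharp
using System;

public static class BuiltInReplaceDemo
{
    // Uses string.Replace(string, string, StringComparison); the replacement
    // is plain text, so "$0" needs no escaping.
    public static string Fill(string template)
    {
        return template.Replace("%PolicyAmount%", "$0", StringComparison.OrdinalIgnoreCase);
    }

    public static void Main()
    {
        // prints "Amount due: $0"
        Console.WriteLine(Fill("Amount due: %policyamount%"));
    }
}
```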