String replaceAll() vs. Matcher replaceAll() (Performance differences)

70,003

Solution 1

According to the documentation for String.replaceAll, it has the following to say about calling the method:

An invocation of this method of the form str.replaceAll(regex, repl) yields exactly the same result as the expression

Pattern.compile(regex).matcher(str).replaceAll(repl)

Therefore, it can be expected the performance between invoking the String.replaceAll, and explicitly creating a Matcher and Pattern should be the same.

Edit

As has been pointed out in the comments, the performance difference being non-existent would be true for a single call to replaceAll from String or Matcher, however, if one needs to perform multiple calls to replaceAll, one would expect it to be beneficial to hold onto a compiled Pattern, so the relatively expensive regular expression pattern compilation does not have to be performed every time.

Solution 2

Source code of String.replaceAll():

public String replaceAll(String regex, String replacement) {
    return Pattern.compile(regex).matcher(this).replaceAll(replacement);
}

It has to compile the pattern first - if you're going to run it many times with the same pattern on short strings, performance will be much better if you reuse one compiled Pattern.

Solution 3

The main difference is that if you hold onto the Pattern used to produce the Matcher, you can avoid recompiling the regex every time you use it. Going through String, you don't get the ability to "cache" like this.

If you have a different regex every time, using the String class's replaceAll is fine. If you are applying the same regex to many strings, create one Pattern and reuse it.

Solution 4

Immutability / thread safety: compiled Patterns are immutable, Matchers are not. (see Is Java Regex Thread Safe?)

Handling empty strings: replaceAll should handle empty strings gracefully (it won't match an empty input string pattern)

Making coffee, etc.: last I heard, neither String nor Pattern nor Matcher had any API features for that.

edit: as for handling NULLs, the documentation for String and Pattern doesn't explicitly say so, but I suspect they'd throw a NullPointerException since they expect a String.

Solution 5

The difference is that String.replaceAll() compiles the regex each time it's called. There's no equivalent for .NET's static Regex.Replace() method, which automatically caches the compiled regex. Usually, replaceAll() is something you do only once, but if you're going to be calling it repeatedly with the same regex, especially in a loop, you should create a Pattern object and use the Matcher method.

You can create the Matcher ahead of time, too, and use its reset() method to retarget it for each use:

Matcher m = Pattern.compile(regex).matcher("");
for (String s : targets)
{
  System.out.println(m.reset(s).replaceAll(repl));
}

The performance benefit of reusing the Matcher, of course, is nowhere as great as that of reusing the Pattern.

Share:
70,003
SuPra
Author by

SuPra

Updated on January 20, 2022

Comments

  • SuPra
    SuPra over 2 years

    Are there known difference(s) between String.replaceAll() and Matcher.replaceAll() (On a Matcher Object created from a Regex.Pattern) in terms of performance?

    Also, what are the high-level API 'ish differences between the both? (Immutability, Handling NULLs, Handling empty strings, etc.)