List of all special characters that need to be escaped in a regex
Solution 1
You can look at the javadoc of the Pattern class: http://docs.oracle.com/javase/8/docs/api/java/util/regex/Pattern.html
You need to escape any character listed there if you want its literal meaning rather than its special meaning.
As a possibly simpler solution, you can put the template between \Q and \E; everything between them is treated as escaped.
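A minimal sketch of the \Q...\E approach. The class name, the helper `wrapInQuoteBlock`, and the sample template are hypothetical and only serve the illustration; the sketch assumes the template never contains the sequence \E itself.

```java
import java.util.regex.Pattern;

class QuotedBlockDemo {
    // Hypothetical helper: wraps a literal in \Q...\E.
    // Assumption: the literal does not itself contain "\E".
    static String wrapInQuoteBlock(String literal) {
        return "\\Q" + literal + "\\E";
    }

    public static void main(String[] args) {
        String template = "1+1=2 (approx.)";
        String regex = wrapInQuoteBlock(template);

        // The template matches itself, metacharacters and all.
        System.out.println(Pattern.matches(regex, template));          // true
        // '+' is no longer a quantifier, so dropping it breaks the match.
        System.out.println(Pattern.matches(regex, "11=2 (approx.)"));  // false
    }
}
```

If the template can come from user input, the \E assumption may not hold, and Pattern.quote is probably the safer choice.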
Solution 2
- Java characters that have to be escaped in regular expressions are:
\.[]{}()<>*+-=!?^$|
- Two of the closing brackets (] and }) only need to be escaped after opening the same type of bracket.
- Inside [] brackets, some characters (like + and -) sometimes work without escaping.
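The last two points can be checked with a short sketch. The class and the `fullMatch` helper are hypothetical names for this illustration, and the patterns are just sample cases, not an exhaustive list.

```java
import java.util.regex.Pattern;

class BracketContextDemo {
    // Thin wrapper so each check reads as "does regex match the whole input?"
    static boolean fullMatch(String regex, String input) {
        return Pattern.matches(regex, input);
    }

    public static void main(String[] args) {
        // A closing bracket with no matching opener is taken literally.
        System.out.println(fullMatch("a]", "a]"));     // true
        System.out.println(fullMatch("a}", "a}"));     // true

        // Inside [...], '+' is literal, and so is a trailing '-'.
        System.out.println(fullMatch("[+-]", "+"));    // true
        System.out.println(fullMatch("[+-]", "-"));    // true

        // Between two characters, '-' forms a range: here '(' through ')'.
        System.out.println(fullMatch("[(-)]", "-"));   // false
        System.out.println(fullMatch("[(\\-)]", "-")); // true once escaped
    }
}
```

Escaping these characters everywhere is harmless, which is why the blanket list above is usually the easier rule to follow.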
Solution 3
To escape you could just use this from Java 1.5:
Pattern.quote("$test");
This matches exactly the literal text $test.
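A minimal sketch of how that call might be used. The class name, the `literal` helper, and the sample inputs are hypothetical; only Pattern.quote and Pattern.compile come from the JDK.

```java
import java.util.regex.Pattern;

class PatternQuoteDemo {
    // Builds a regex that finds `word` literally, whatever characters it contains.
    static Pattern literal(String word) {
        return Pattern.compile(Pattern.quote(word));
    }

    public static void main(String[] args) {
        Pattern p = literal("$test");
        System.out.println(p.matcher("price: $test").find()); // true
        System.out.println(p.matcher("price: test").find());  // false, the '$' is required
    }
}
```

Because the result of Pattern.quote is itself a regex fragment, it can be concatenated with unquoted parts when only some of the pattern comes from untrusted text.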
Solution 4
According to the String Literals / Metacharacters documentation page, they are:
<([{\^-=$!|]})?*+.>
Also it would be cool to have that list referenced somewhere in code, but I don't know where that could be...
Solution 5
Combining what everyone said, I propose the following, to keep the list of characters special to RegExp clearly listed in their own String, and to avoid having to try to visually parse thousands of "\\"'s. This seems to work pretty well for me:
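import java.util.regex.Matcher;
import java.util.regex.Pattern;

// The metacharacters, listed once in a readable form.
final String regExSpecialChars = "<([{\\^-=$!|]})?*+.>";
// Prefix every character with a backslash so it is literal inside the class.
final String regExSpecialCharsRE = regExSpecialChars.replaceAll( ".", "\\\\$0");
final Pattern reCharsREP = Pattern.compile( "[" + regExSpecialCharsRE + "]");

// Returns s with every regex metacharacter backslash-escaped.
String quoteRegExSpecialChars( String s)
{
    Matcher m = reCharsREP.matcher( s);
    return m.replaceAll( "\\\\$0");
}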
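One way to sanity-check this helper is a round trip: escape a string, compile the result, and confirm it matches the original string and nothing looser. The sketch below repeats the definitions above inside a hypothetical class `EscapeRoundTrip` so that it compiles on its own; the sample input is made up.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class EscapeRoundTrip {
    static final String SPECIAL_CHARS = "<([{\\^-=$!|]})?*+.>";
    // Each metacharacter prefixed with a backslash, then wrapped in a class.
    static final Pattern SPECIAL =
        Pattern.compile("[" + SPECIAL_CHARS.replaceAll(".", "\\\\$0") + "]");

    static String quoteRegExSpecialChars(String s) {
        Matcher m = SPECIAL.matcher(s);
        return m.replaceAll("\\\\$0");
    }

    public static void main(String[] args) {
        String input = "a.b*c";
        String escaped = quoteRegExSpecialChars(input);

        System.out.println(escaped);                           // a\.b\*c
        System.out.println(Pattern.matches(escaped, input));   // true
        System.out.println(Pattern.matches(escaped, "aXbbc")); // false
    }
}
```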
Avinash Nair
Updated on August 24, 2020
Comments
-
Avinash Nair about 2 years
I am trying to create an application that matches a message template with a message that a user is trying to send. I am using Java regex for matching the message. The template/message may contain special characters.
How would I get the complete list of special characters that need to be escaped in order for my regex to work and match in the maximum possible cases?
Is there a universal solution for escaping all special characters in Java regex?
-
mkdev about 9 years
If you find \Q and \E hard to remember, you can use Pattern.quote("...") instead.
-
Aleksandr Dubinsky over 8 years
I wish you'd actually stated them.
-
Sorin over 8 years
Why, @AleksandrDubinsky?
-
Aleksandr Dubinsky over 8 years
@Sorin Because it is the spirit (nay, policy?) of Stack Exchange to state the answer in your answer rather than just linking to an off-site resource. Besides, that page doesn't have a clear list either. A list can be found here: docs.oracle.com/javase/tutorial/essential/regex/literals.html, yet it states "In certain situations the special characters listed above will not be treated as metacharacters," without explaining what will happen if one tries to escape them. In short, this question deserves a good answer.
-
fracz about 8 years
String escaped = regexString.replaceAll("([\\\\\\.\\[\\{\\(\\*\\+\\?\\^\\$\\|])", "\\\\$1");
-
nhahtdh over 7 years
) also has to be escaped, and depending on whether you are inside or outside of a character class, there can be more characters to escape, in which case Pattern.quote does quite a good job of escaping a string for use both inside and outside of a character class.
-
Dominika over 6 years
Is there any way to not escape but allow those characters?
-
marbel82 over 6 years
String escaped = tnk.replaceAll("[\\<\\(\\[\\{\\\\\\^\\-\\=\\$\\!\\|\\]\\}\\)\\?\\*\\+\\.\\>]", "\\\\$0");
-
Tobi G. over 6 years
Escaping a character means allowing the character itself instead of interpreting it as an operator.
-
Kenston Choi about 6 years
An unescaped - within [] may not always work, since it is used to define ranges. It's safer to escape it. For example, the patterns [-] and [-)] match the string -, but [(-)] does not.
-
Sasha about 6 years
"everything between them [\Q and \E] is considered as escaped", except other \Q's and \E's (which may occur within the original regex). So it's better to use Pattern.quote as suggested here and not reinvent the wheel.
-
Joe Bowbeer over 5 years
The Pattern javadoc says it is an error to use a backslash prior to any alphabetic character that does not denote an escaped construct, but a backslash may be used prior to a non-alphabetic character regardless of whether that character is part of an unescaped construct. Therefore a much simpler regex will suffice:
s.replaceAll("[\\W]", "\\\\$0")
where \W designates non-word characters.
-
Old Nick almost 4 years
Even though the accepted answer does answer the question, this answer was more helpful to me when I was just looking for a quick list.
-
Volksman about 3 years
Why is this not the most highly rated answer? It solves the problem without going into the complex details of listing all characters that need escaping, and it's part of the JDK, so there is no need to write any extra code! Simple!
-
Hawk over 2 years
-=! do not necessarily need to be escaped; it depends on the context. For example, as a single character each of them works as a literal regex.
-
aeskreis 11 months
Saved me some time, thank you!
-
Asher A 10 months
What if a regex contains \E? How can it be escaped? E.g., "\\Q\\Eeee\\E" throws a java.util.regex.PatternSyntaxException: Illegal/unsupported escape sequence near index 4
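For text that itself contains \E, the Pattern.quote suggestion from earlier in the thread appears to be the way out, since it takes care of any embedded \E. A minimal sketch, assuming a hypothetical sample string and helper name:

```java
import java.util.regex.Pattern;
import java.util.regex.PatternSyntaxException;

class EmbeddedQuoteEndDemo {
    // Lets Pattern.quote decide how to handle any \E inside the text.
    static boolean matchesLiterally(String text) {
        return Pattern.matches(Pattern.quote(text), text);
    }

    public static void main(String[] args) {
        String text = "a\\Eb"; // the four characters: a \ E b

        // Hand-rolled wrapping: the embedded \E closes the quote early.
        try {
            Pattern.compile("\\Q" + text + "\\E");
            System.out.println("naive wrapping compiled");
        } catch (PatternSyntaxException e) {
            System.out.println("naive wrapping rejected: " + e.getDescription());
        }

        // Quoting through the JDK keeps the whole text literal.
        System.out.println(matchesLiterally(text)); // true
    }
}
```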