Regex using Java String.replaceAll
Solution 1
If this is a function that you call repeatedly, there is a problem: each regular expression is compiled again on every call. It is better to compile them once and keep them as constants. You could have something like this:
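// Requires: import java.util.regex.Pattern;

// Compiled once when the class is loaded, then reused on every call.
private static final Pattern[] patterns = {
    Pattern.compile("</?i>"),
    Pattern.compile("//"),
    // Others
};

// replacements[i] is the replacement text for patterns[i].
private static final String[] replacements = {
    "",
    "/",
    // Others
};

public static String cleanString(String str) {
    for (int i = 0; i < patterns.length; i++) {
        str = patterns[i].matcher(str).replaceAll(replacements[i]);
    }
    return str;
}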
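As a rough, self-contained illustration of the difference, the sketch below puts the two styles side by side. The class and method names (PrecompiledPatternDemo, stripTagsRecompiling, stripTagsPrecompiled) are made up for this example; only Pattern.compile, Matcher.replaceAll and String.replaceAll are real API calls.

```java
import java.util.regex.Pattern;

// Hypothetical demo class; names are illustrative only.
final class PrecompiledPatternDemo {
    // Compiled once when the class is loaded and reused afterwards.
    private static final Pattern ITALIC_TAG = Pattern.compile("</?i>");

    // String.replaceAll compiles the regex again on every call.
    static String stripTagsRecompiling(String s) {
        return s.replaceAll("</?i>", "");
    }

    // Reuses the precompiled Pattern; only a new Matcher is created per call.
    static String stripTagsPrecompiled(String s) {
        return ITALIC_TAG.matcher(s).replaceAll("");
    }

    public static void main(String[] args) {
        String input = "<i>Physics</i> Dept";
        System.out.println(stripTagsRecompiling(input));
        System.out.println(stripTagsPrecompiled(input));
    }
}
```

Both methods return the same string; the second one simply avoids the repeated compilation step.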
Solution 2
cleanInst.replaceAll("[<i>]", "");
should be:
cleanInst = cleanInst.replaceAll("[<i>]", "");
since the String class is immutable and doesn't change its internal state, i.e. replaceAll() returns a new instance that's different from cleanInst.
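A minimal sketch of that point, assuming the pattern </?i> from the other answers instead of the character class above. The class and method names (ImmutableStringDemo, discardResult, keepResult) are hypothetical.

```java
// Hypothetical demo class; names are illustrative only.
final class ImmutableStringDemo {
    // The return value of replaceAll is thrown away, so s is unchanged.
    static String discardResult(String s) {
        s.replaceAll("</?i>", "");
        return s;
    }

    // The return value is assigned back, so the cleaned string is returned.
    static String keepResult(String s) {
        s = s.replaceAll("</?i>", "");
        return s;
    }

    public static void main(String[] args) {
        String input = "<i>Physics</i>";
        System.out.println(discardResult(input)); // still contains the tags
        System.out.println(keepResult(input));    // tags removed
    }
}
```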
Solution 3
You should read a basic regular expressions tutorial.
Until then, what you tried to do can be done like this:
cleanInst = cleanInst.replace("//", "/");
cleanInst = cleanInst.replaceAll("</?i>", "");
cleanInst = cleanInst.replaceAll("/n\\b", ";");
cleanInst = cleanInst.replaceAll("\\bPhysics Dept\\.", "Physics Department");
cleanInst = cleanInst.replaceAll("(?i)\\b(?:the )?dept\\b\\.?", "The Department");
You could probably chain all those replace operations (but I don't know the proper Java syntax for this).
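For what it's worth, chaining does work in Java, because replace and replaceAll each return a new String. The following is only a sketch that applies the same five replacements in the same order; the class and method names (ChainedReplaceDemo, clean) are made up for the example.

```java
// Hypothetical demo class; names are illustrative only.
final class ChainedReplaceDemo {
    // Each call returns a new String, so the calls can be chained directly.
    static String clean(String cleanInst) {
        return cleanInst
                .replace("//", "/")       // literal replacement, no regex
                .replaceAll("</?i>", "")  // strip <i> and </i>
                .replaceAll("/n\\b", ";")
                .replaceAll("\\bPhysics Dept\\.", "Physics Department")
                .replaceAll("(?i)\\b(?:the )?dept\\b\\.?", "The Department");
    }

    public static void main(String[] args) {
        System.out.println(clean("<i>Physics Dept.</i>"));
        System.out.println(clean("the dept. of a//b"));
    }
}
```

The order matters here: the "Physics Dept." rule has to run before the generic "dept" rule, otherwise the generic rule would rewrite it first.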
About the word boundaries: \b usually only makes sense directly before or after an alphanumeric character. For example, \b/n\b will only match /n if it's directly preceded by an alphanumeric character and followed by a non-alphanumeric character, so it matches "a/n!" but not "foo /n bar".
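The two cases above can be checked with a short sketch. The class and method names (WordBoundaryDemo, matchesBothSides, matchesRightOnly) are hypothetical; note that in Java source the regex \b has to be written as "\\b".

```java
import java.util.regex.Pattern;

// Hypothetical demo class; names are illustrative only.
final class WordBoundaryDemo {
    // Word boundary required on both sides of "/n".
    private static final Pattern BOTH_SIDES = Pattern.compile("\\b/n\\b");
    // Word boundary required only after the "n".
    private static final Pattern RIGHT_ONLY = Pattern.compile("/n\\b");

    static boolean matchesBothSides(String s) {
        return BOTH_SIDES.matcher(s).find();
    }

    static boolean matchesRightOnly(String s) {
        return RIGHT_ONLY.matcher(s).find();
    }

    public static void main(String[] args) {
        System.out.println(matchesBothSides("a/n!"));       // true
        System.out.println(matchesBothSides("foo /n bar")); // false
        System.out.println(matchesRightOnly("foo /n bar")); // true
    }
}
```

In "foo /n bar" the slash is preceded by a space, so there is no boundary between two non-word characters and the leading \b fails; dropping it lets the pattern match.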
user2072797
Updated on July 09, 2022
Comments
-
user2072797 almost 2 years
I am looking to replace a Java string value as follows. The code below is not working.
cleanInst.replaceAll("[<i>]", "");
cleanInst.replaceAll("[</i>]", "");
cleanInst.replaceAll("[//]", "/");
cleanInst.replaceAll("[\bPhysics Dept.\b]", "Physics Department");
cleanInst.replaceAll("[\b/n\b]", ";");
cleanInst.replaceAll("[\bDEPT\b]", "The Department");
cleanInst.replaceAll("[\bDEPT.\b]", "The Department");
cleanInst.replaceAll("[\bThe Dept.\b]", "The Department");
cleanInst.replaceAll("[\bthe dept.\b]", "The Department");
cleanInst.replaceAll("[\bThe Dept\b]", "The Department");
cleanInst.replaceAll("[\bthe dept\b]", "The Department");
cleanInst.replaceAll("[\bDept.\b]", "The Department");
cleanInst.replaceAll("[\bdept.\b]", "The Department");
cleanInst.replaceAll("[\bdept\b]", "The Department");
What is the easiest way to achieve the above replace?
-
stinepike about 11 years
What do you mean by not working?
-
Reinstate Monica -- notmaynard about 11 years
Remove the square brackets ([ and ]). These are for character classes. If something else is not working, you'll need to be more specific.
-
fge about 11 years
Are you aware of what a character class is in a regex? regex.info
-
SLaks about 11 years
Strings are immutable.
-
Isaac about 11 years
and Ignore Case modifier would work for a lot of the dept replaces
-
jahroy about 11 years
As @SLaks has pointed out: Strings are immutable. Your code will do nothing if you don't store the return value of String.replaceAll() somewhere. Right now your code does nothing with the return value.
-
Bohemian about 11 years
+1 your answer is pretty good, but why the non-capturing group for "the "? Is it just "performance"? Cos IMHO readability drops more than performance increases. Btw I suspect /n is meant to be \n
-
Tim Pietzcker about 11 years
I'm just used to doing it like this. I never use capturing parentheses unless I want to capture a group. I agree that there's tension between stating one's intentions clearly and readability.
-
AxA over 7 years
Instead of Pattern, we now have Matcher objects created every time. How is this better?
-
Ade Miller over 6 years
Because compiling a regex Pattern is more costly than creating a Matcher for a (pre-compiled) Pattern?
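The character-class point raised in the comments can be seen in a small sketch: inside square brackets, each character is matched on its own, so "[<i>]" removes every <, i and > rather than the tag as a whole. The class and method names (CharacterClassDemo, withCharacterClass, withLiteralTag) are hypothetical.

```java
// Hypothetical demo class; names are illustrative only.
final class CharacterClassDemo {
    // "[<i>]" is a character class: it matches any single '<', 'i' or '>'.
    static String withCharacterClass(String s) {
        return s.replaceAll("[<i>]", "");
    }

    // "</?i>" matches the literal tags <i> and </i>.
    static String withLiteralTag(String s) {
        return s.replaceAll("</?i>", "");
    }

    public static void main(String[] args) {
        String input = "<i>Physics</i>";
        System.out.println(withCharacterClass(input)); // the 'i' in Physics is lost too
        System.out.println(withLiteralTag(input));
    }
}
```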