How to implement a SQL like 'LIKE' operator in java?
Solution 1
.* matches any sequence of characters in a regular expression.
I think the Java syntax would be
"digital".matches(".*ital.*");
And for the single character match just use a single dot.
"digital".matches(".*gi.a.*");
And to match an actual dot, escape it with a backslash: \. (written as "\\." inside a Java string literal).
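As a small illustration of the difference between an unescaped and an escaped dot, here is a minimal sketch. The class name DotEscapeDemo and the helper matches() are made up for this example; String.matches() is the only library call used.

```java
final class DotEscapeDemo {
    // Hypothetical helper: true if the WHOLE input matches the regex,
    // which is how String.matches() behaves.
    static boolean matches(String input, String regex) {
        return input.matches(regex);
    }

    public static void main(String[] args) {
        // Unescaped dot: stands for any single character.
        System.out.println(matches("digital", ".*gi.a.*")); // true
        System.out.println(matches("axb", "a.b"));          // true

        // Escaped dot: "\\." in Java source is the regex \. (a literal dot).
        System.out.println(matches("a.b", "a\\.b"));        // true
        System.out.println(matches("axb", "a\\.b"));        // false
    }
}
```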
Solution 2
Yes, this could be done with a regular expression. Keep in mind that Java's regular expressions have a different syntax from SQL's "like". Instead of "%", you would have ".*", and instead of "?", you would have ".".
What makes it somewhat tricky is that you would also have to escape any characters that Java treats as special. Since you're trying to make this analogous to SQL, I'm guessing that ^$[]{}\ shouldn't appear in the regex string. But you will have to replace "." with "\\." before doing any other replacements. (Edit: Pattern.quote(String) escapes everything by surrounding the string with "\Q" and "\E", which will cause everything in the expression to be treated as a literal (no wildcards at all). So you definitely don't want to use it.)
Furthermore, as Dave Webb says, you also need to ignore case.
With that in mind, here's a sample of what it might look like:
public static boolean like(String str, String expr) {
    expr = expr.toLowerCase(); // ignoring locale for now
    expr = expr.replace(".", "\\."); // "\\" is escaped to "\" (thanks, Alan M)
    // ... escape any other potentially problematic characters here
    expr = expr.replace("?", ".");
    expr = expr.replace("%", ".*");
    str = str.toLowerCase();
    return str.matches(expr);
}
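One way to avoid listing the problematic characters by hand is to apply Pattern.quote() only to the literal runs between wildcards, so the wildcards themselves are never quoted. The following is just a sketch under that assumption, not the answer's own method; the class name QuotedLike is invented for the example, and it keeps "?" as the single-character wildcard used in the question.

```java
import java.util.regex.Pattern;

final class QuotedLike {
    // Translates a LIKE-style expression ('%' = any run, '?' = one char)
    // into a regex, quoting every literal run so that regex
    // metacharacters in the expression are matched literally.
    static boolean like(String str, String expr) {
        StringBuilder regex = new StringBuilder();
        StringBuilder literal = new StringBuilder();
        for (int i = 0; i < expr.length(); i++) {
            char c = expr.charAt(i);
            if (c == '%' || c == '?') {
                flush(literal, regex);
                regex.append(c == '%' ? ".*" : ".");
            } else {
                literal.append(c);
            }
        }
        flush(literal, regex);
        return Pattern
                .compile(regex.toString(), Pattern.CASE_INSENSITIVE | Pattern.DOTALL)
                .matcher(str)
                .matches();
    }

    // Appends the pending literal run (if any) in quoted form and clears it.
    private static void flush(StringBuilder literal, StringBuilder regex) {
        if (literal.length() > 0) {
            regex.append(Pattern.quote(literal.toString()));
            literal.setLength(0);
        }
    }
}
```

With this sketch, QuotedLike.like("digital", "%gi?a%") is true, and a pattern such as "a.b" only matches a string containing a real dot.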
Solution 3
Regular expressions are the most versatile. However, some LIKE functions can be formed without regular expressions. e.g.
String text = "digital";
text.startsWith("dig"); // like "dig%"
text.endsWith("tal"); // like "%tal"
text.contains("gita"); // like "%gita%"
Solution 4
Every SQL reference I can find says the "any single character" wildcard is the underscore (_), not the question mark (?). That simplifies things a bit, since the underscore is not a regex metacharacter. However, you still can't use Pattern.quote() for the reason given by mmyers. I've got another method here for escaping regexes when I might want to edit them afterward. With that out of the way, the like() method becomes pretty simple:
// requires: import java.util.regex.Pattern;

public static boolean like(final String str, final String expr)
{
    // Escape regex metacharacters first, then translate the SQL wildcards.
    String regex = quotemeta(expr);
    regex = regex.replace("_", ".").replace("%", ".*?");
    Pattern p = Pattern.compile(regex,
        Pattern.CASE_INSENSITIVE | Pattern.DOTALL);
    return p.matcher(str).matches();
}

// Backslash-escapes every regex metacharacter in s, leaving '_' and '%' alone.
public static String quotemeta(String s)
{
    if (s == null)
    {
        throw new IllegalArgumentException("String cannot be null");
    }
    int len = s.length();
    if (len == 0)
    {
        return "";
    }
    StringBuilder sb = new StringBuilder(len * 2);
    for (int i = 0; i < len; i++)
    {
        char c = s.charAt(i);
        if ("[](){}.*+?$^|#\\".indexOf(c) != -1)
        {
            sb.append("\\");
        }
        sb.append(c);
    }
    return sb.toString();
}
If you really want to use ? for the wildcard, your best bet would be to remove it from the list of metacharacters in the quotemeta() method. Replacing its escaped form -- replace("\\?", ".") -- wouldn't be safe because there might be backslashes in the original expression.
And that brings us to the real problems: most SQL flavors seem to support character classes in the forms [a-z] and [^j-m] or [!j-m], and they all provide a way to escape wildcard characters. The latter is usually done by means of an ESCAPE keyword, which lets you define a different escape character every time. As you can imagine, this complicates things quite a bit. Converting to a regex is probably still the best option, but parsing the original expression will be much harder--in fact, the first thing you would have to do is formalize the syntax of the LIKE-like expressions themselves.
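To give a rough idea of what handling an ESCAPE character involves, here is a minimal sketch that scans the expression one character at a time. It assumes a simplified grammar: only the % and _ wildcards plus a caller-supplied escape character, with no character classes. The class name EscapedLike and its methods are invented for this example.

```java
import java.util.regex.Pattern;

final class EscapedLike {
    // Converts a LIKE expression to a regex. The character following
    // `escape` is always taken literally; '%' and '_' are wildcards
    // otherwise. Character classes such as [a-z] are NOT supported here.
    static String toRegex(String expr, char escape) {
        StringBuilder regex = new StringBuilder();
        for (int i = 0; i < expr.length(); i++) {
            char c = expr.charAt(i);
            if (c == escape) {
                if (i + 1 >= expr.length()) {
                    throw new IllegalArgumentException("Dangling escape character");
                }
                // Skip the escape and quote the next character as a literal.
                regex.append(Pattern.quote(String.valueOf(expr.charAt(++i))));
            } else if (c == '%') {
                regex.append(".*");
            } else if (c == '_') {
                regex.append('.');
            } else {
                regex.append(Pattern.quote(String.valueOf(c)));
            }
        }
        return regex.toString();
    }

    static boolean like(String str, String expr, char escape) {
        return Pattern
                .compile(toRegex(expr, escape), Pattern.CASE_INSENSITIVE | Pattern.DOTALL)
                .matcher(str)
                .matches();
    }
}
```

For example, EscapedLike.like("100%", "100!%", '!') is true while EscapedLike.like("100x", "100!%", '!') is false, because the escaped % is treated as a literal percent sign.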
Solution 5
To implement the LIKE functions of SQL in Java, you don't need regular expressions. They can be obtained as:
String text = "apple";
text.startsWith("app"); // like "app%"
text.endsWith("le"); // like "%le"
text.contains("ppl"); // like "%ppl%"
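Patterns with several % wildcards, such as "%this%string%", can also be handled without regular expressions by checking the pieces between the % signs in order. The sketch below assumes % is the only wildcard (no single-character wildcard, no escaping) and compares case-sensitively; the class name SegmentLike is made up for the example.

```java
final class SegmentLike {
    // Matches text against a pattern in which '%' means "any run of characters".
    static boolean like(String text, String pattern) {
        // limit -1 keeps trailing empty strings, so "ab%" yields ["ab", ""].
        String[] parts = pattern.split("%", -1);
        if (parts.length == 1) {
            return text.equals(pattern); // no wildcard at all
        }
        String first = parts[0];
        String last = parts[parts.length - 1];
        if (!text.startsWith(first)) {
            return false;
        }
        int pos = first.length();                 // next unmatched position
        int end = text.length() - last.length();  // where the suffix begins
        if (end < pos || !text.endsWith(last)) {
            return false;
        }
        // Each middle piece must occur, in order, between the prefix and suffix.
        for (int i = 1; i < parts.length - 1; i++) {
            int found = text.indexOf(parts[i], pos);
            if (found < 0 || found + parts[i].length() > end) {
                return false;
            }
            pos = found + parts[i].length();
        }
        return true;
    }
}
```

Because the pieces must appear in order, SegmentLike.like("I like apples but not oranges", "%oranges%apples%") is false, while the same text matches "%apples%oranges%".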
Chris
Updated on July 09, 2022
Comments
-
Chris almost 2 years
I need a comparator in Java which has the same semantics as the SQL 'like' operator. For example:
myComparator.like("digital","%ital%");
myComparator.like("digital","%gi?a%");
myComparator.like("digital","digi%");
should evaluate to true, and
myComparator.like("digital","%cam%");
myComparator.like("digital","tal%");
should evaluate to false. Any ideas how to implement such a comparator, or does anyone know an implementation with the same semantics? Can this be done using a regular expression?
-
Volkan Yazıcı about 6 years
See RegexUtil#sqlPatternToRegex(String) from the Apache Cayenne project.
-
Chris about 15 years
Yeah, thanks! But what if the word isn't as simple as "%dig%" and the string needs some escaping? Is there anything already existing? What about the '?'?
-
Chris about 15 years
What about "%this%string%"? Split on the '%' sign, iterate over the array and then check every entry? I think this could be done better ...
-
Bob about 15 years
I edited my answer for the question mark operator. I am a little confused by the rest of your comment though. Are you saying the string is coming to you in SQL syntax and you want to evaluate it as is? If that is the case, I think you will need to replace the SQL syntax manually.
-
Chris about 15 years
What if the string which is used as a search pattern contains grouping characters like '(' or ')'? Escape them too? How many other characters need escaping?
-
Bob about 15 years
I think that will depend on how many options you are allowing.
-
Chris about 15 years
Is there a method which escapes every character with special meaning in Java regex?
-
palantus about 15 years
Yes, Pattern.quote (java.sun.com/javase/6/docs/api/java/util/regex/… ) will do it. For some reason, I thought that might cause a problem, but now I don't know why I didn't include it in the answer.
-
palantus about 15 years
Oh yes, now I remember. It's because ? is a special regex character, so it would be escaped before we could replace it. I suppose we could instead use Pattern.quote and then expr = expr.replace("\\?", ".");
-
GreenieMeanie about 15 years
Just beware that .* is greedy (.*? might be more appropriate). I don't think .* in regex has exactly the same semantics as % in SQL.
-
Alan Moore almost 15 years
Your inner split() and loop replaces any \? sequence with a dot--I don't get that. Why single out that sequence, only to replace it with a dot just like a lone question mark?
-
tommyL almost 15 years
It replaces the '?' with a '.' because '?' is a placeholder for a single arbitrary character. I know '\\\\\\?' looks strange, but I tested it and for my tests it seems to work.
-
tommyL almost 15 years
Do you know if Hibernate supports this feature? I mean, to filter objects currently in memory using such an expression?
-
True Soft almost 12 years
You can also add expr = expr.replaceAll("(?<!\\\\)_", "."); because "\_" can be escaped in SQL, and should not be replaced with "." in this case. (I used _ instead of ? for one character.)
-
True Soft almost 12 years
Also, for %, this replacement would be better: expr = expr.replaceAll("(?<!\\\\)%", ".*");
-
Leo over 7 years
if(s == null) throw new IllegalArgumentException("String cannot be null"); else if(s.isEmpty()) return "";
-
Pang about 7 years
This is essentially just a repeat of the existing answers posted many years ago.
-
Christian about 4 years
Oh really? And what if text was "I like apples but not oranges" and the search is something like "%oranges%apples%"?