Verify if String matches a format String
Solution 1
I don't know of a library that does that. Here is an example of how to convert a format pattern into a regex. Note that Pattern.quote is important: it keeps literal text in the format string from being interpreted as regex syntax.
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// copied from java.util.Formatter
// %[argument_index$][flags][width][.precision][t]conversion
private static final String formatSpecifier
    = "%(\\d+\\$)?([-#+ 0,(\\<]*)?(\\d+)?(\\.\\d+)?([tT])?([a-zA-Z%])";

private static final Pattern formatToken = Pattern.compile(formatSpecifier);

public Pattern convert(final String format) {
    final StringBuilder regex = new StringBuilder();
    final Matcher matcher = formatToken.matcher(format);
    int lastIndex = 0;
    regex.append('^');
    while (matcher.find()) {
        // literal text between two format specifiers is quoted verbatim
        regex.append(Pattern.quote(format.substring(lastIndex, matcher.start())));
        regex.append(convertToken(matcher.group(1), matcher.group(2), matcher.group(3),
            matcher.group(4), matcher.group(5), matcher.group(6)));
        lastIndex = matcher.end();
    }
    // trailing literal text after the last specifier
    regex.append(Pattern.quote(format.substring(lastIndex, format.length())));
    regex.append('$');
    return Pattern.compile(regex.toString());
}
Of course, implementing convertToken will be a challenge. Here is something to start with:
private static String convertToken(String index, String flags, String width, String precision, String temporal, String conversion) {
    if (conversion.equals("s")) {
        // \w already covers digits, letters and underscore
        return "\\w*";
    } else if (conversion.equals("d")) {
        // width is null for a plain %d; "{null}" would not be a valid quantifier
        return width == null ? "\\d+" : "\\d{" + width + "}";
    }
    throw new IllegalArgumentException("%" + index + flags + width + precision + temporal + conversion);
}
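To show how the two pieces above could fit together into the matches(formatted, format) function from the question, here is a minimal, self-contained sketch. The class name FormatMatcher and the method toPattern are hypothetical, not part of any library. It assumes only %s, %d and %% appear in the format, treats a zero-padded width as a minimum digit count, and ignores negative numbers for zero-padded %d; everything else is rejected.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper (not a library class): supports %s, %d and %% only.
final class FormatMatcher {
    // %[argument_index$][flags][width][.precision][t]conversion
    private static final Pattern FORMAT_TOKEN = Pattern.compile(
            "%(\\d+\\$)?([-#+ 0,(\\<]*)?(\\d+)?(\\.\\d+)?([tT])?([a-zA-Z%])");

    static Pattern toPattern(String format) {
        StringBuilder regex = new StringBuilder();
        Matcher m = FORMAT_TOKEN.matcher(format);
        int last = 0;
        while (m.find()) {
            // Literal text between specifiers is quoted so it is matched verbatim.
            regex.append(Pattern.quote(format.substring(last, m.start())));
            regex.append(convertToken(m.group(2), m.group(3), m.group(6)));
            last = m.end();
        }
        regex.append(Pattern.quote(format.substring(last)));
        // DOTALL so that %s may also stand for text containing line breaks.
        return Pattern.compile(regex.toString(), Pattern.DOTALL);
    }

    private static String convertToken(String flags, String width, String conversion) {
        String f = flags == null ? "" : flags;
        switch (conversion) {
            case "%":
                return "%";                       // %% is a literal percent sign
            case "s":
                return ".*";                      // any text at all
            case "d":
                if (width == null) {
                    return "-?\\d+";              // plain %d
                }
                if (f.equals("0")) {
                    return "\\d{" + width + ",}"; // assumption: non-negative, zero-padded
                }
                if (f.isEmpty()) {
                    return " *-?\\d+";            // space-padded; minimum width not enforced
                }
                break;
            default:
                break;
        }
        throw new IllegalArgumentException("Unsupported specifier: %" + f
                + (width == null ? "" : width) + conversion);
    }

    // matches() checks the whole input, so no explicit ^ and $ anchors are needed.
    static boolean matches(String formatted, String format) {
        return toPattern(format).matcher(formatted).matches();
    }
}
```

With this sketch, FormatMatcher.matches("song001.mp3", "song%03d.mp3") is true and FormatMatcher.matches("potato", "song%03d.mp3") is false. Note that a match only means the string could have been produced by the format; because %s accepts any text, the check is necessarily loose.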
Solution 2
You can use Java regular expressions - please see http://www.vogella.de/articles/JavaRegularExpressions/article.html
Thanks...
Solution 3
Since you do not know the format in advance, you will have to write a method that converts a format string into a regexp. Not trivial, but possible. Here is a simple example for the two test cases you have given:
public static String getRegexpFromFormatString(String format)
{
    String toReturn = format;
    // escape some special regexp chars
    toReturn = toReturn.replaceAll("\\.", "\\\\.");
    toReturn = toReturn.replaceAll("\\!", "\\\\!");
    if (toReturn.indexOf("%") >= 0)
    {
        toReturn = toReturn.replaceAll("%s", "[\\\\w]+"); //accepts 0-9 A-Z a-z _
        while (toReturn.matches(".*%([0-9]+)[d]{1}.*"))
        {
            String digitStr = toReturn.replaceFirst(".*%([0-9]+)[d]{1}.*", "$1");
            int numDigits = Integer.parseInt(digitStr);
            toReturn = toReturn.replaceFirst("(.*)(%[0-9]+[d]{1})(.*)", "$1[0-9]{" + numDigits + "}$3");
        }
    }
    return "^" + toReturn + "$";
}
and some test code:
public static void main(String[] args) throws Exception
{
    String[] formats = {"hello %s!", "song%03d.mp3", "song%03d.mp3"};
    for (int i=0; i<formats.length; i++)
    {
        System.out.println("Format in [" + i + "]: " + formats[i]);
        System.out.println("Regexp out[" + i + "]: " + getRegexpFromFormatString(formats[i]));
    }
    String[] words = {"hello world!", "song001.mp3", "potato"};
    for (int i=0; i<formats.length; i++)
    {
        System.out.println("Word [" + i + "]: " + words[i] +
            " : matches=" + words[i].matches(getRegexpFromFormatString(formats[i])));
    }
}
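The method above escapes only "." and "!", so other regex metacharacters in the literal part of a format would still be interpreted as regex syntax. As a rough illustration of the difference Pattern.quote makes, here is a small sketch. The class QuoteDemo, its method names and the format "track(%d).mp3" are made up for this example, and it assumes the format contains exactly one %d.

```java
import java.util.regex.Pattern;

// Hypothetical demo class: compares unescaped and quoted literal text.
final class QuoteDemo {
    // Made-up format with a single %d placeholder and regex metacharacters around it.
    static final String FORMAT = "track(%d).mp3";

    // Pastes the literal text into the regex as-is: "(" and "." keep their regex meaning.
    static String unquotedRegex(String format) {
        return format.replace("%d", "[0-9]+");
    }

    // Quotes the literal text on both sides of the placeholder.
    // Assumption: the format contains "%d" (indexOf does not return -1).
    static String quotedRegex(String format) {
        int i = format.indexOf("%d");
        return Pattern.quote(format.substring(0, i))
                + "[0-9]+"
                + Pattern.quote(format.substring(i + 2));
    }

    public static void main(String[] args) {
        // Unquoted: the parentheses become a group and the dot matches any character.
        System.out.println("track(7).mp3".matches(unquotedRegex(FORMAT))); // false
        System.out.println("track7xmp3".matches(unquotedRegex(FORMAT)));   // true
        // Quoted: the literal text must appear exactly as written.
        System.out.println("track(7).mp3".matches(quotedRegex(FORMAT)));   // true
        System.out.println("track7xmp3".matches(quotedRegex(FORMAT)));     // false
    }
}
```

String.matches tests the whole input, which is why no ^ and $ anchors are added here.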
Solution 4
There is no simple way to do this. A straightforward way would be to write some code that converts format strings (or a simpler subset of them) to regular expressions and then match those using the standard regular expression classes.
A better way is probably to rethink/refactor your code. Why do you want this?
hpique
Updated on August 26, 2020
Comments
-
hpique over 3 years
In Java, how can you determine if a String matches a format string (e.g. song%03d.mp3)?
In other words, how would you implement the following function?
/**
 * @return true if formatted equals String.format(format, something), false otherwise.
 */
boolean matches(String formatted, String format);
Examples:
matches("hello world!", "hello %s!"); // true
matches("song001.mp3", "song%03d.mp3"); // true
matches("potato", "song%03d.mp3"); // false
Maybe there's a way to convert a format string into a regex?
Clarification
The format String is a parameter. I don't know it in advance. song%03d.mp3 is just an example. It could be any other format string.
If it helps, I can assume that the format string will only have one parameter.
-
hpique over 12 years
The format String is a parameter. I don't know it in advance. song%03d.mp3 was just an example.
-
hpique over 12 years
And how do you convert a generic format string into a pattern?
-
hpique over 12 years
I'd love to use regular expressions, but what I'm given is a format string.
-
Yhn over 12 years
Hence my comment about replacing format codes like %03d with their regular expression equivalent :). The page you linked completely defines the possible codes and prefixes; you'd need to write a function that searches for those codes and replaces them. A %d would be replaced with \d+; %03d could become \d{3}\d* (to ensure a minimum of 3, but possibly "infinite" digits).
-
hpique over 12 years
That's what I would like to avoid. I didn't write the whole code.
-
dtech over 12 years
There is definitely no native way to do this. So you either need to rethink your input/code, write your own converter or find a converter that does this. E.g. why exactly do you need to use a format string?
-
hpique over 12 years
We chose format strings because they're pretty much the same across all platforms, unlike regex.
-
Mario Duarte over 12 years
Well, if you're giving it as a parameter for a Java application, why don't you just use Java regexps?
-
Vishwas Mehra over 12 years
@hgpc ok I've modified my answer appropriately. It's more than I would usually do for a SO answer but I was intrigued. :) You would have to perfect/complete this for production use but it is an idea for how to approach this if necessary.
-
rascio over 12 years
But you have to write a regex... this is what I don't understand... how is the regex created? Do you need something that creates the regex automatically, or something that checks if a string contains a regex?
-
hpique over 12 years
Because the Java app is one of the many clients that receive this input.
-
hpique over 12 years
+1 This is more or less what I'm doing right now. Thanks for posting code.
-
dtech over 12 years
Perl5 compatible regular expressions are implemented in nearly all programming languages. But if you have to do this the only thing you can do is write a converter. It's not that hard. Also note that you're using the wrong format for the job. Format strings are only intended for data -> string(s). Regular expressions are broader. What you're doing now is basically re-inventing an impractical regex notation.
-
Cephalopod over 12 years
If you want to be a hero, you can publish your code as open source.
-
hpique over 12 years
I'm no stranger to publishing open-source code, but this is too specific to publish.