In Java, how can you determine if a String matches a format string (e.g. song%03d.mp3)?
In other words, how would you implement the following function?
The String class has a matches method that takes a regex: String.matches(String)
For the regex syntax, see:
http://download.oracle.com/javase/1,5.0/docs/api/java/util/regex/Pattern.html
Example:
"song001.mp3".matches("song\\d{3}\\.mp3");
You can use Java regular expressions - please see http://www.vogella.de/articles/JavaRegularExpressions/article.html
Thanks...
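One detail about String.matches that is easy to miss: it succeeds only if the whole string matches the pattern, so no ^ or $ anchors are needed. To look for the pattern somewhere inside a longer string, Matcher.find can be used instead. A small sketch of the difference (the class name MatchesDemo and both helper methods are made up for illustration):

```java
import java.util.regex.Pattern;

class MatchesDemo {
    // Hypothetical helper: true only if the WHOLE name is "song" + 3 digits + ".mp3"
    static boolean isSongFile(String name) {
        return name.matches("song\\d{3}\\.mp3");
    }

    // Hypothetical helper: true if the pattern occurs anywhere inside the text
    static boolean containsSongFile(String text) {
        return Pattern.compile("song\\d{3}\\.mp3").matcher(text).find();
    }

    public static void main(String[] args) {
        System.out.println(isSongFile("song001.mp3"));           // true
        System.out.println(isSongFile("my song001.mp3"));        // false
        System.out.println(containsSongFile("my song001.mp3"));  // true
    }
}
```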
Since you do not know the format in advance, you will have to write a method that converts a format string into a regexp. Not trivial, but possible. Here is a simple example for the 2 test cases you have given:
public static String getRegexpFromFormatString(String format)
{
    String toReturn = format;
    // escape some special regexp chars (only '.' and '!' are handled here)
    toReturn = toReturn.replaceAll("\\.", "\\\\.");
    toReturn = toReturn.replaceAll("\\!", "\\\\!");
    if (toReturn.indexOf("%") >= 0)
    {
        toReturn = toReturn.replaceAll("%s", "[\\\\w]+"); // accepts 0-9 A-Z a-z _
        // replace each %<n>d with a character class matching exactly n digits
        while (toReturn.matches(".*%([0-9]+)[d]{1}.*"))
        {
            String digitStr = toReturn.replaceFirst(".*%([0-9]+)[d]{1}.*", "$1");
            int numDigits = Integer.parseInt(digitStr);
            toReturn = toReturn.replaceFirst("(.*)(%[0-9]+[d]{1})(.*)", "$1[0-9]{" + numDigits + "}$3");
        }
    }
    return "^" + toReturn + "$";
}
and some test code:
public static void main(String[] args) throws Exception
{
    String[] formats = {"hello %s!", "song%03d.mp3", "song%03d.mp3"};
    for (int i = 0; i < formats.length; i++)
    {
        System.out.println("Format in [" + i + "]: " + formats[i]);
        System.out.println("Regexp out[" + i + "]: " + getRegexpFromFormatString(formats[i]));
    }
    // words[i] is checked against formats[i]; "potato" is expected not to match
    String[] words = {"hello world!", "song001.mp3", "potato"};
    for (int i = 0; i < words.length; i++)
    {
        System.out.println("Word [" + i + "]: " + words[i] +
            " : matches=" + words[i].matches(getRegexpFromFormatString(formats[i])));
    }
}
There is no simple way to do this. A straightforward way would be to write some code that converts format strings (or a simpler subset of them) to regular expressions and then match those using the standard regular expression classes.
A better way is probably to rethink/refactor your code. Why do you want this?
You can use String.matches, although you'd need to use a regular expression then, rather than the format string.
It shouldn't be too hard to replace something like %03d with a \d{3} regex equivalent.
Example:
"song001.mp3".matches("song\\d{3}\\.mp3") // True
"potato".matches("song\\d{3}\\.mp3") // False
If you really need the format string, you'll need to make a function that replaces the format with a regex equivalent, and escapes the regex reserved characters; then use the String.matches function.
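A minimal sketch of that idea, under the assumption that only %d and zero-padded %0Nd need to be supported and everything else in the format is literal text. The class name SimpleFormatMatcher and the method matchesFormat are hypothetical names chosen for this example; the literal parts are escaped with Pattern.quote.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class SimpleFormatMatcher {
    // Assumption: only %d and %0Nd are recognised; anything else stays literal.
    private static final Pattern INT_SPEC = Pattern.compile("%(?:0(\\d+))?d");

    // Hypothetical helper: builds a regex from the format and tests the input against it.
    static boolean matchesFormat(String input, String format) {
        StringBuilder regex = new StringBuilder();
        Matcher m = INT_SPEC.matcher(format);
        int last = 0;
        while (m.find()) {
            // Escape the literal text between specifiers so '.' etc. match literally
            regex.append(Pattern.quote(format.substring(last, m.start())));
            String width = m.group(1); // null for a plain %d
            regex.append(width == null ? "\\d+" : "\\d{" + width + "}");
            last = m.end();
        }
        regex.append(Pattern.quote(format.substring(last)));
        return input.matches(regex.toString());
    }

    public static void main(String[] args) {
        System.out.println(matchesFormat("song001.mp3", "song%03d.mp3")); // true
        System.out.println(matchesFormat("potato", "song%03d.mp3"));      // false
    }
}
```

Note that this treats the width as an exact digit count, as the \d{3} suggestion does. String.format treats the width as a minimum, so 1234 formatted with %03d gives 1234; if such values should also match, the quantifier could be changed to {N,}. Negative numbers are not handled.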
I don't know of a library that does that. Here is an example of how to convert a format pattern into a regex. Note that Pattern.quote is important: it makes sure that regex metacharacters occurring in the literal parts of the format string are matched literally.
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// copied from java.util.Formatter
// %[argument_index$][flags][width][.precision][t]conversion
private static final String formatSpecifier
    = "%(\\d+\\$)?([-#+ 0,(\\<]*)?(\\d+)?(\\.\\d+)?([tT])?([a-zA-Z%])";
private static final Pattern formatToken = Pattern.compile(formatSpecifier);

public Pattern convert(final String format) {
    final StringBuilder regex = new StringBuilder();
    final Matcher matcher = formatToken.matcher(format);
    int lastIndex = 0;
    regex.append('^');
    while (matcher.find()) {
        // quote the literal text before this specifier, then translate the specifier
        regex.append(Pattern.quote(format.substring(lastIndex, matcher.start())));
        regex.append(convertToken(matcher.group(1), matcher.group(2), matcher.group(3),
            matcher.group(4), matcher.group(5), matcher.group(6)));
        lastIndex = matcher.end();
    }
    // quote whatever literal text follows the last specifier
    regex.append(Pattern.quote(format.substring(lastIndex, format.length())));
    regex.append('$');
    return Pattern.compile(regex.toString());
}
Of course, implementing convertToken will be a challenge. Here is something to start with:
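private static String convertToken(String index, String flags, String width, String precision, String temporal, String conversion) {
    if (conversion.equals("s")) {
        // \w already includes digits, so [\w\d] reduces to \w
        return "\\w*";
    } else if (conversion.equals("d")) {
        // width is null for a plain %d; accept one or more digits in that case
        return width == null ? "\\d+" : "\\d{" + width + "}";
    }
    // groups that did not participate in the match are null and print as "null" here
    throw new IllegalArgumentException("%" + index + flags + width + precision + temporal + conversion);
}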
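To see how the pieces fit together, here is a condensed, self-contained sketch of the same approach with a small usage example. The class name FormatToRegexDemo and the method matchesFormat are made up for this example, and convertToken is reduced to the width and conversion arguments. Only %s, %d and %% are supported, which is an assumption of the sketch and not a complete mapping of java.util.Formatter.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class FormatToRegexDemo {
    // Same specifier pattern as in the answer (taken from java.util.Formatter)
    private static final Pattern FORMAT_TOKEN = Pattern.compile(
            "%(\\d+\\$)?([-#+ 0,(\\<]*)?(\\d+)?(\\.\\d+)?([tT])?([a-zA-Z%])");

    static Pattern convert(String format) {
        StringBuilder regex = new StringBuilder();
        Matcher matcher = FORMAT_TOKEN.matcher(format);
        int lastIndex = 0;
        while (matcher.find()) {
            // Literal text is quoted, each specifier is translated to a regex fragment
            regex.append(Pattern.quote(format.substring(lastIndex, matcher.start())));
            regex.append(convertToken(matcher.group(3), matcher.group(6)));
            lastIndex = matcher.end();
        }
        regex.append(Pattern.quote(format.substring(lastIndex)));
        return Pattern.compile(regex.toString());
    }

    // Simplified for this sketch: flags, precision and argument index are ignored
    static String convertToken(String width, String conversion) {
        switch (conversion) {
            case "s": return "\\w*";
            case "d": return width == null ? "\\d+" : "\\d{" + width + "}";
            case "%": return "%";
            default:
                throw new IllegalArgumentException("Unsupported conversion: %" + conversion);
        }
    }

    // Hypothetical convenience method; Matcher.matches checks the whole input
    static boolean matchesFormat(String input, String format) {
        return convert(format).matcher(input).matches();
    }

    public static void main(String[] args) {
        System.out.println(matchesFormat("song001.mp3", "song%03d.mp3")); // true
        System.out.println(matchesFormat("potato", "song%03d.mp3"));      // false
        System.out.println(matchesFormat("hello world!", "hello %s!"));   // true
    }
}
```

Because Matcher.matches already requires the whole input to match, the ^ and $ anchors are left out in this version.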