What is the best way to match a fully qualified Java class name in a text?
Examples: java.lang.Reflect, java.util.ArrayList, org.hiber
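The following expression works fine for me (note that it is lowercase-only, so it has to be matched case-insensitively if the names contain capital letters):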
^[a-z][a-z0-9_]*(\.[a-z0-9_]+)+$
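To see how this expression behaves, here is a minimal sketch that compiles it once as written and once with Pattern.CASE_INSENSITIVE. The class name SimpleNameCheck, the two helper methods and the sample strings are made up for illustration.

```java
import java.util.regex.Pattern;

class SimpleNameCheck {
    // The expression from above, exactly as written (lowercase only).
    static final Pattern STRICT =
        Pattern.compile("^[a-z][a-z0-9_]*(\\.[a-z0-9_]+)+$");

    // The same expression, compiled case-insensitively.
    static final Pattern RELAXED =
        Pattern.compile("^[a-z][a-z0-9_]*(\\.[a-z0-9_]+)+$", Pattern.CASE_INSENSITIVE);

    static boolean matchesStrict(String s) {
        return STRICT.matcher(s).matches();
    }

    static boolean matchesRelaxed(String s) {
        return RELAXED.matcher(s).matches();
    }

    public static void main(String[] args) {
        String[] samples = {"java.util.arraylist", "java.util.ArrayList", "ArrayList"};
        for (String s : samples) {
            System.out.println(s + " strict=" + matchesStrict(s) + " relaxed=" + matchesRelaxed(s));
        }
    }
}
```

Because the group ends in "+", at least one dot is required, so an unqualified name like ArrayList is rejected either way. The expression also does not allow '$', which shows up in the binary names of nested classes.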
The following class validates that a provided package name is valid:
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

public class ValidationUtils {

    // All Java reserved words (keywords and the literals true, false and null)
    // that must not be used in a valid package name.
    private static final Set<String> reserved = new HashSet<>(Arrays.asList(
        "abstract", "assert", "boolean", "break", "byte", "case",
        "catch", "char", "class", "const", "continue", "default",
        "do", "double", "else", "enum", "extends", "false",
        "final", "finally", "float", "for", "if", "goto",
        "implements", "import", "instanceof", "int", "interface", "long",
        "native", "new", "null", "package", "private", "protected",
        "public", "return", "short", "static", "strictfp", "super",
        "switch", "synchronized", "this", "throw", "throws", "transient",
        "true", "try", "void", "volatile", "while"));
    /**
     * Checks if the string that is provided is a valid Java package name: every element
     * must be a valid Java identifier, elements are separated by a single '.', and an
     * element can't be one of Java's reserved words.
     *
     * @param name The package name that needs to be validated.
     * @return <b>true</b> if the package name is valid, <b>false</b> if it is not valid.
     */
    public static final boolean isValidPackageName(String name) {
        // A limit of -1 keeps trailing empty strings, so a name like "a.b." is rejected.
        String[] parts = name.split("\\.", -1);
        for (String part : parts) {
            if (reserved.contains(part)) return false;
            if (!validPart(part)) return false;
        }
        return true;
    }
    /**
     * Checks that a part (a word between dots) is a valid part to be used in a Java package name.
     * @param part The part between dots (e.g. *PART*.*PART*.*PART*.*PART*).
     * @return <b>true</b> if the part is valid, <b>false</b> if it is not valid.
     */
    private static boolean validPart(String part) {
        if (part == null || part.length() < 1) {
            // Package part is null or empty.
            return false;
        }
        if (!Character.isJavaIdentifierStart(part.charAt(0))) {
            // Package part does not begin with a valid identifier start character.
            return false;
        }
        // The first character was checked above, so start at index 1.
        for (int i = 1; i < part.length(); i++) {
            if (!Character.isJavaIdentifierPart(part.charAt(i))) {
                // Package part contains a character that is not allowed in an identifier.
                return false;
            }
        }
        return true;
    }
}
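If you would rather not maintain the reserved-word list by hand, the JDK itself ships a check that covers the same ground: javax.lang.model.SourceVersion.isName(CharSequence). This is a different technique from the class above, shown only as a sketch. The wrapper class SourceVersionCheck and its method isValidName are names I made up, and the exact set of keywords it rejects depends on the JDK version you run it on.

```java
import javax.lang.model.SourceVersion;

class SourceVersionCheck {
    /**
     * Returns true if the string is a syntactically valid qualified name
     * according to the running JDK (identifiers separated by single dots,
     * none of them a keyword or literal).
     */
    static boolean isValidName(String name) {
        // Guard against null explicitly instead of relying on isName's behaviour.
        return name != null && SourceVersion.isName(name);
    }

    public static void main(String[] args) {
        String[] samples = {"java.util.ArrayList", "java.lang.null", "java..util", "1abc.Foo"};
        for (String s : samples) {
            System.out.println(s + " -> " + isValidName(s));
        }
    }
}
```

Note that this only checks syntax; it says nothing about whether such a class or package actually exists.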
A fully qualified Java class name has the structure
N.N.N.N
where each "N" part must be a Java identifier. Java identifiers cannot start with a digit, but after the initial character they may use any combination of letters, digits, underscores or dollar signs:
([a-zA-Z_$][a-zA-Z\d_$]*\.)*[a-zA-Z_$][a-zA-Z\d_$]*
 -----------------------    -----------------------
            N                          N
Identifiers also cannot be reserved words (like import, true or null). If you only want to check plausibility, the above is enough. If you also want to check validity, you must check against a list of reserved words as well.
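The difference between the plausibility check and the validity check can be sketched in Java roughly like this. The class name QualifiedNameCheck and the methods isPlausible and isValid are hypothetical names chosen for the example; the reserved-word list covers the Java keywords plus the literals true, false and null.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;
import java.util.regex.Pattern;

class QualifiedNameCheck {
    // One identifier "N": a letter, '_' or '$', then letters, digits, '_' or '$'.
    private static final String N = "[a-zA-Z_$][a-zA-Z\\d_$]*";

    // The expression from above: zero or more "N." prefixes followed by a final N.
    private static final Pattern NAME = Pattern.compile("(" + N + "\\.)*" + N);

    private static final Set<String> RESERVED = new HashSet<>(Arrays.asList(
        "abstract", "assert", "boolean", "break", "byte", "case", "catch", "char",
        "class", "const", "continue", "default", "do", "double", "else", "enum",
        "extends", "false", "final", "finally", "float", "for", "if", "goto",
        "implements", "import", "instanceof", "int", "interface", "long", "native",
        "new", "null", "package", "private", "protected", "public", "return",
        "short", "static", "strictfp", "super", "switch", "synchronized", "this",
        "throw", "throws", "transient", "true", "try", "void", "volatile", "while"));

    /** Plausibility only: the whole string has the shape N.N.N. */
    static boolean isPlausible(String s) {
        return NAME.matcher(s).matches();
    }

    /** Validity: plausible, and no part between dots is a reserved word. */
    static boolean isValid(String s) {
        if (!isPlausible(s)) {
            return false;
        }
        // A plausible name has no empty parts, so a plain split is enough here.
        for (String part : s.split("\\.")) {
            if (RESERVED.contains(part)) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        String[] samples = {"java.util.ArrayList", "java.lang.null", "1abc.Foo"};
        for (String s : samples) {
            System.out.println(s + " plausible=" + isPlausible(s) + " valid=" + isValid(s));
        }
    }
}
```

Since the prefix group uses "*", a single unqualified identifier such as ArrayList also counts as plausible; change it to "+" if at least one dot should be required.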
Java identifiers may contain any Unicode letter, not just Latin ones. If you want to allow for this as well, use Unicode character classes:
([\p{Letter}_$][\p{Letter}\p{Number}_$]*\.)*[\p{Letter}_$][\p{Letter}\p{Number}_$]*
or, for short
([\p{L}_$][\p{L}\p{N}_$]*\.)*[\p{L}_$][\p{L}\p{N}_$]*
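Since the question is about finding names inside a text, here is a rough sketch that uses the short Unicode form with Matcher.find(). The class name UnicodeNameFinder, the method findInText and the sample sentence are invented for the example. One deliberate deviation from the expression above: the prefix group uses "+" instead of "*", so that only names with at least one dot are picked up; otherwise every ordinary word in the text would match.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class UnicodeNameFinder {
    // Unicode-aware identifier: \p{L} is any letter, \p{N} any number character.
    private static final String N = "[\\p{L}_$][\\p{L}\\p{N}_$]*";

    // Assumption for this sketch: "+" requires at least one "N." prefix,
    // so plain words without a dot are skipped.
    private static final Pattern DOTTED = Pattern.compile("(?:" + N + "\\.)+" + N);

    /** Returns every dotted name found in the text, in order of appearance. */
    static List<String> findInText(String text) {
        List<String> found = new ArrayList<>();
        Matcher m = DOTTED.matcher(text);
        while (m.find()) {
            found.add(m.group());
        }
        return found;
    }

    public static void main(String[] args) {
        // \u00f6 and \u00df are the letters o-umlaut and sharp s, written as
        // escapes so the source file does not depend on its encoding.
        String text = "Use java.util.ArrayList or com.example.Gr\u00f6\u00dfe here.";
        System.out.println(findInText(text));
    }
}
```

A trailing full stop at the end of a sentence is not swallowed, because the expression must end in an identifier. Matches found this way are only plausible candidates; they still need the reserved-word check if you want validity.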
The Java Language Specification (section 3.8) has all the details about valid identifier names.
Also see the answer to this question: Java Unicode variable names