I\'m trying to create a regular expression that detect a new class for example:
public interface IGame {
or
private class Game
Change your regex like below to match both type of string formats.
line.matches("(?:public|protected|private|static)\\s+(?:class|interface)\\s+\\w+\\s*\\{");
Example:
String s1 = "public interface IGame {";
String s2 = "private class Game {";
System.out.println(s1.matches("(?:public|protected|private|static)\\s+(?:class|interface)\\s+\\w+\\s*\\{"));
System.out.println(s2.matches("(?:public|protected|private|static)\\s+(?:class|interface)\\s+\\w+\\s*\\{"));
Output:
true
true
This regex is an incomplete specification of Java class and interface declaration. However, it can match declarations like this:
abstract class X<B extends Integer,D extends java.io.InputStream,R extends Comparator<? super D>>extends java.util.ArrayList<Integer>implements java.util.Queue<Integer>,Serializable{}
public@Deprecated interface K<D> extends Comparable<Integer>{}
So it should be sufficient for most purpose.
Please read the code to generate the regex at the end to know exactly what was skipped. In summary, ReferenceType
, Annotation
are not fully incorporated into this regex since they would require recursive regex to parse properly.
The Java regex to match Java class declaration in all its g(l)ory details:
(?:((?:public|protected|private|abstract|static|final|strictfp|@[ \t\f\r\n]*+\p{javaJavaIdentifierStart}\p{javaJavaIdentifierPart}*+)(?:[ \t\f\r\n]*+(?:(?<!\p{javaJavaIdentifierPart})|(?!\p{javaJavaIdentifierPart}))(?:public|protected|private|abstract|static|final|strictfp|@[ \t\f\r\n]*+\p{javaJavaIdentifierStart}\p{javaJavaIdentifierPart}*+))*+)[ \t\f\r\n]*+)?+(?:(?<!\p{javaJavaIdentifierPart})|(?!\p{javaJavaIdentifierPart}))class[ \t\f\r\n]++(\p{javaJavaIdentifierStart}\p{javaJavaIdentifierPart}*+)[ \t\f\r\n]*+(?:(<[ \t\f\r\n]*+\p{javaJavaIdentifierStart}\p{javaJavaIdentifierPart}*+(?:[ \t\f\r\n]++extends[ \t\f\r\n]++(?:\p{javaJavaIdentifierStart}\p{javaJavaIdentifierPart}*+(?:[ \t\f\r\n]*+[.][ \t\f\r\n]*+\p{javaJavaIdentifierStart}\p{javaJavaIdentifierPart}*+)*+(?:[ \t\f\r\n]*+<[ \t\f\r\n]*+(?:\p{javaJavaIdentifierStart}\p{javaJavaIdentifierPart}*+|[?][ \t\f\r\n]*+(?:(?:extends|super)[ \t\f\r\n]++\p{javaJavaIdentifierStart}\p{javaJavaIdentifierPart}*+)?+)(?:[ \t\f\r\n]*+,[ \t\f\r\n]*+(?:\p{javaJavaIdentifierStart}\p{javaJavaIdentifierPart}*+|[?][ \t\f\r\n]*+(?:(?:extends|super)[ \t\f\r\n]++\p{javaJavaIdentifierStart}\p{javaJavaIdentifierPart}*+)?+))*+[ \t\f\r\n]*+>)?+(?:[ \t\f\r\n]*+[.][ \t\f\r\n]*+\p{javaJavaIdentifierStart}\p{javaJavaIdentifierPart}*+(?:[ \t\f\r\n]*+<[ \t\f\r\n]*+(?:\p{javaJavaIdentifierStart}\p{javaJavaIdentifierPart}*+|[?][ \t\f\r\n]*+(?:(?:extends|super)[ \t\f\r\n]++\p{javaJavaIdentifierStart}\p{javaJavaIdentifierPart}*+)?+)(?:[ \t\f\r\n]*+,[ \t\f\r\n]*+(?:\p{javaJavaIdentifierStart}\p{javaJavaIdentifierPart}*+|[?][ \t\f\r\n]*+(?:(?:extends|super)[ \t\f\r\n]++\p{javaJavaIdentifierStart}\p{javaJavaIdentifierPart}*+)?+))*+[ \t\f\r\n]*+>)?+)*+|\p{javaJavaIdentifierStart}\p{javaJavaIdentifierPart}*+))?+(?:[ \t\f\r\n]*+,[ \t\f\r\n]*+\p{javaJavaIdentifierStart}\p{javaJavaIdentifierPart}*+(?:[ \t\f\r\n]++extends[ \t\f\r\n]++(?:\p{javaJavaIdentifierStart}\p{javaJavaIdentifierPart}*+(?:[ \t\f\r\n]*+[.][ \t\f\r\n]*+\p{javaJavaIdentifierStart}\p{javaJavaIdentifierPart}*+)*+(?:[ \t\f\r\n]*+<[ \t\f\r\n]*+(?:\p{javaJavaIdentifierStart}\p{javaJavaIdentifierPart}*+|[?][ \t\f\r\n]*+(?:(?:extends|super)[ \t\f\r\n]++\p{javaJavaIdentifierStart}\p{javaJavaIdentifierPart}*+)?+)(?:[ \t\f\r\n]*+,[ \t\f\r\n]*+(?:\p{javaJavaIdentifierStart}\p{javaJavaIdentifierPart}*+|[?][ \t\f\r\n]*+(?:(?:extends|super)[ \t\f\r\n]++\p{javaJavaIdentifierStart}\p{javaJavaIdentifierPart}*+)?+))*+[ \t\f\r\n]*+>)?+(?:[ \t\f\r\n]*+[.][ \t\f\r\n]*+\p{javaJavaIdentifierStart}\p{javaJavaIdentifierPart}*+(?:[ \t\f\r\n]*+<[ \t\f\r\n]*+(?:\p{javaJavaIdentifierStart}\p{javaJavaIdentifierPart}*+|[?][ \t\f\r\n]*+(?:(?:extends|super)[ \t\f\r\n]++\p{javaJavaIdentifierStart}\p{javaJavaIdentifierPart}*+)?+)(?:[ \t\f\r\n]*+,[ \t\f\r\n]*+(?:\p{javaJavaIdentifierStart}\p{javaJavaIdentifierPart}*+|[?][ \t\f\r\n]*+(?:(?:extends|super)[ \t\f\r\n]++\p{javaJavaIdentifierStart}\p{javaJavaIdentifierPart}*+)?+))*+[ \t\f\r\n]*+>)?+)*+|\p{javaJavaIdentifierStart}\p{javaJavaIdentifierPart}*+))?+)*+[ \t\f\r\n]*+>)[ \t\f\r\n]*+)?+(?:(?<!\p{javaJavaIdentifierPart})|(?!\p{javaJavaIdentifierPart}))(?:(extends[ \t\f\r\n]++\p{javaJavaIdentifierStart}\p{javaJavaIdentifierPart}*+(?:[ \t\f\r\n]*+[.][ \t\f\r\n]*+\p{javaJavaIdentifierStart}\p{javaJavaIdentifierPart}*+)*+(?:[ \t\f\r\n]*+<[ \t\f\r\n]*+(?:\p{javaJavaIdentifierStart}\p{javaJavaIdentifierPart}*+|[?][ \t\f\r\n]*+(?:(?:extends|super)[ \t\f\r\n]++\p{javaJavaIdentifierStart}\p{javaJavaIdentifierPart}*+)?+)(?:[ \t\f\r\n]*+,[ \t\f\r\n]*+(?:\p{javaJavaIdentifierStart}\p{javaJavaIdentifierPart}*+|[?][ \t\f\r\n]*+(?:(?:extends|super)[ \t\f\r\n]++\p{javaJavaIdentifierStart}\p{javaJavaIdentifierPart}*+)?+))*+[ \t\f\r\n]*+>)?+(?:[ \t\f\r\n]*+[.][ \t\f\r\n]*+\p{javaJavaIdentifierStart}\p{javaJavaIdentifierPart}*+(?:[ \t\f\r\n]*+<[ \t\f\r\n]*+(?:\p{javaJavaIdentifierStart}\p{javaJavaIdentifierPart}*+|[?][ \t\f\r\n]*+(?:(?:extends|super)[ \t\f\r\n]++\p{javaJavaIdentifierStart}\p{javaJavaIdentifierPart}*+)?+)(?:[ \t\f\r\n]*+,[ \t\f\r\n]*+(?:\p{javaJavaIdentifierStart}\p{javaJavaIdentifierPart}*+|[?][ \t\f\r\n]*+(?:(?:extends|super)[ \t\f\r\n]++\p{javaJavaIdentifierStart}\p{javaJavaIdentifierPart}*+)?+))*+[ \t\f\r\n]*+>)?+)*+)[ \t\f\r\n]*+)?+(?:(?<!\p{javaJavaIdentifierPart})|(?!\p{javaJavaIdentifierPart}))(?:(implements[ \t\f\r\n]++\p{javaJavaIdentifierStart}\p{javaJavaIdentifierPart}*+(?:[ \t\f\r\n]*+[.][ \t\f\r\n]*+\p{javaJavaIdentifierStart}\p{javaJavaIdentifierPart}*+)*+(?:[ \t\f\r\n]*+<[ \t\f\r\n]*+(?:\p{javaJavaIdentifierStart}\p{javaJavaIdentifierPart}*+|[?][ \t\f\r\n]*+(?:(?:extends|super)[ \t\f\r\n]++\p{javaJavaIdentifierStart}\p{javaJavaIdentifierPart}*+)?+)(?:[ \t\f\r\n]*+,[ \t\f\r\n]*+(?:\p{javaJavaIdentifierStart}\p{javaJavaIdentifierPart}*+|[?][ \t\f\r\n]*+(?:(?:extends|super)[ \t\f\r\n]++\p{javaJavaIdentifierStart}\p{javaJavaIdentifierPart}*+)?+))*+[ \t\f\r\n]*+>)?+(?:[ \t\f\r\n]*+[.][ \t\f\r\n]*+\p{javaJavaIdentifierStart}\p{javaJavaIdentifierPart}*+(?:[ \t\f\r\n]*+<[ \t\f\r\n]*+(?:\p{javaJavaIdentifierStart}\p{javaJavaIdentifierPart}*+|[?][ \t\f\r\n]*+(?:(?:extends|super)[ \t\f\r\n]++\p{javaJavaIdentifierStart}\p{javaJavaIdentifierPart}*+)?+)(?:[ \t\f\r\n]*+,[ \t\f\r\n]*+(?:\p{javaJavaIdentifierStart}\p{javaJavaIdentifierPart}*+|[?][ \t\f\r\n]*+(?:(?:extends|super)[ \t\f\r\n]++\p{javaJavaIdentifierStart}\p{javaJavaIdentifierPart}*+)?+))*+[ \t\f\r\n]*+>)?+)*+(?:[ \t\f\r\n]*+,[ \t\f\r\n]*+\p{javaJavaIdentifierStart}\p{javaJavaIdentifierPart}*+(?:[ \t\f\r\n]*+[.][ \t\f\r\n]*+\p{javaJavaIdentifierStart}\p{javaJavaIdentifierPart}*+)*+(?:[ \t\f\r\n]*+<[ \t\f\r\n]*+(?:\p{javaJavaIdentifierStart}\p{javaJavaIdentifierPart}*+|[?][ \t\f\r\n]*+(?:(?:extends|super)[ \t\f\r\n]++\p{javaJavaIdentifierStart}\p{javaJavaIdentifierPart}*+)?+)(?:[ \t\f\r\n]*+,[ \t\f\r\n]*+(?:\p{javaJavaIdentifierStart}\p{javaJavaIdentifierPart}*+|[?][ \t\f\r\n]*+(?:(?:extends|super)[ \t\f\r\n]++\p{javaJavaIdentifierStart}\p{javaJavaIdentifierPart}*+)?+))*+[ \t\f\r\n]*+>)?+(?:[ \t\f\r\n]*+[.][ \t\f\r\n]*+\p{javaJavaIdentifierStart}\p{javaJavaIdentifierPart}*+(?:[ \t\f\r\n]*+<[ \t\f\r\n]*+(?:\p{javaJavaIdentifierStart}\p{javaJavaIdentifierPart}*+|[?][ \t\f\r\n]*+(?:(?:extends|super)[ \t\f\r\n]++\p{javaJavaIdentifierStart}\p{javaJavaIdentifierPart}*+)?+)(?:[ \t\f\r\n]*+,[ \t\f\r\n]*+(?:\p{javaJavaIdentifierStart}\p{javaJavaIdentifierPart}*+|[?][ \t\f\r\n]*+(?:(?:extends|super)[ \t\f\r\n]++\p{javaJavaIdentifierStart}\p{javaJavaIdentifierPart}*+)?+))*+[ \t\f\r\n]*+>)?+)*+)*+[ \t\f\r\n]*+))?+[{]
And for Java interface declaration:
(?:((?:public|protected|private|abstract|static|strictfp|@[ \t\f\r\n]*+\p{javaJavaIdentifierStart}\p{javaJavaIdentifierPart}*+)(?:[ \t\f\r\n]*+(?:(?<!\p{javaJavaIdentifierPart})|(?!\p{javaJavaIdentifierPart}))(?:public|protected|private|abstract|static|strictfp|@[ \t\f\r\n]*+\p{javaJavaIdentifierStart}\p{javaJavaIdentifierPart}*+))*+)[ \t\f\r\n]*+)?+(?:(?<!\p{javaJavaIdentifierPart})|(?!\p{javaJavaIdentifierPart}))interface[ \t\f\r\n]++(\p{javaJavaIdentifierStart}\p{javaJavaIdentifierPart}*+)[ \t\f\r\n]*+(?:(<[ \t\f\r\n]*+\p{javaJavaIdentifierStart}\p{javaJavaIdentifierPart}*+(?:[ \t\f\r\n]++extends[ \t\f\r\n]++(?:\p{javaJavaIdentifierStart}\p{javaJavaIdentifierPart}*+(?:[ \t\f\r\n]*+[.][ \t\f\r\n]*+\p{javaJavaIdentifierStart}\p{javaJavaIdentifierPart}*+)*+(?:[ \t\f\r\n]*+<[ \t\f\r\n]*+(?:\p{javaJavaIdentifierStart}\p{javaJavaIdentifierPart}*+|[?][ \t\f\r\n]*+(?:(?:extends|super)[ \t\f\r\n]++\p{javaJavaIdentifierStart}\p{javaJavaIdentifierPart}*+)?+)(?:[ \t\f\r\n]*+,[ \t\f\r\n]*+(?:\p{javaJavaIdentifierStart}\p{javaJavaIdentifierPart}*+|[?][ \t\f\r\n]*+(?:(?:extends|super)[ \t\f\r\n]++\p{javaJavaIdentifierStart}\p{javaJavaIdentifierPart}*+)?+))*+[ \t\f\r\n]*+>)?+(?:[ \t\f\r\n]*+[.][ \t\f\r\n]*+\p{javaJavaIdentifierStart}\p{javaJavaIdentifierPart}*+(?:[ \t\f\r\n]*+<[ \t\f\r\n]*+(?:\p{javaJavaIdentifierStart}\p{javaJavaIdentifierPart}*+|[?][ \t\f\r\n]*+(?:(?:extends|super)[ \t\f\r\n]++\p{javaJavaIdentifierStart}\p{javaJavaIdentifierPart}*+)?+)(?:[ \t\f\r\n]*+,[ \t\f\r\n]*+(?:\p{javaJavaIdentifierStart}\p{javaJavaIdentifierPart}*+|[?][ \t\f\r\n]*+(?:(?:extends|super)[ \t\f\r\n]++\p{javaJavaIdentifierStart}\p{javaJavaIdentifierPart}*+)?+))*+[ \t\f\r\n]*+>)?+)*+|\p{javaJavaIdentifierStart}\p{javaJavaIdentifierPart}*+))?+(?:[ \t\f\r\n]*+,[ \t\f\r\n]*+\p{javaJavaIdentifierStart}\p{javaJavaIdentifierPart}*+(?:[ \t\f\r\n]++extends[ \t\f\r\n]++(?:\p{javaJavaIdentifierStart}\p{javaJavaIdentifierPart}*+(?:[ \t\f\r\n]*+[.][ \t\f\r\n]*+\p{javaJavaIdentifierStart}\p{javaJavaIdentifierPart}*+)*+(?:[ \t\f\r\n]*+<[ \t\f\r\n]*+(?:\p{javaJavaIdentifierStart}\p{javaJavaIdentifierPart}*+|[?][ \t\f\r\n]*+(?:(?:extends|super)[ \t\f\r\n]++\p{javaJavaIdentifierStart}\p{javaJavaIdentifierPart}*+)?+)(?:[ \t\f\r\n]*+,[ \t\f\r\n]*+(?:\p{javaJavaIdentifierStart}\p{javaJavaIdentifierPart}*+|[?][ \t\f\r\n]*+(?:(?:extends|super)[ \t\f\r\n]++\p{javaJavaIdentifierStart}\p{javaJavaIdentifierPart}*+)?+))*+[ \t\f\r\n]*+>)?+(?:[ \t\f\r\n]*+[.][ \t\f\r\n]*+\p{javaJavaIdentifierStart}\p{javaJavaIdentifierPart}*+(?:[ \t\f\r\n]*+<[ \t\f\r\n]*+(?:\p{javaJavaIdentifierStart}\p{javaJavaIdentifierPart}*+|[?][ \t\f\r\n]*+(?:(?:extends|super)[ \t\f\r\n]++\p{javaJavaIdentifierStart}\p{javaJavaIdentifierPart}*+)?+)(?:[ \t\f\r\n]*+,[ \t\f\r\n]*+(?:\p{javaJavaIdentifierStart}\p{javaJavaIdentifierPart}*+|[?][ \t\f\r\n]*+(?:(?:extends|super)[ \t\f\r\n]++\p{javaJavaIdentifierStart}\p{javaJavaIdentifierPart}*+)?+))*+[ \t\f\r\n]*+>)?+)*+|\p{javaJavaIdentifierStart}\p{javaJavaIdentifierPart}*+))?+)*+[ \t\f\r\n]*+>)[ \t\f\r\n]*+)?+(?:(?<!\p{javaJavaIdentifierPart})|(?!\p{javaJavaIdentifierPart}))(?:(extends[ \t\f\r\n]++\p{javaJavaIdentifierStart}\p{javaJavaIdentifierPart}*+(?:[ \t\f\r\n]*+[.][ \t\f\r\n]*+\p{javaJavaIdentifierStart}\p{javaJavaIdentifierPart}*+)*+(?:[ \t\f\r\n]*+<[ \t\f\r\n]*+(?:\p{javaJavaIdentifierStart}\p{javaJavaIdentifierPart}*+|[?][ \t\f\r\n]*+(?:(?:extends|super)[ \t\f\r\n]++\p{javaJavaIdentifierStart}\p{javaJavaIdentifierPart}*+)?+)(?:[ \t\f\r\n]*+,[ \t\f\r\n]*+(?:\p{javaJavaIdentifierStart}\p{javaJavaIdentifierPart}*+|[?][ \t\f\r\n]*+(?:(?:extends|super)[ \t\f\r\n]++\p{javaJavaIdentifierStart}\p{javaJavaIdentifierPart}*+)?+))*+[ \t\f\r\n]*+>)?+(?:[ \t\f\r\n]*+[.][ \t\f\r\n]*+\p{javaJavaIdentifierStart}\p{javaJavaIdentifierPart}*+(?:[ \t\f\r\n]*+<[ \t\f\r\n]*+(?:\p{javaJavaIdentifierStart}\p{javaJavaIdentifierPart}*+|[?][ \t\f\r\n]*+(?:(?:extends|super)[ \t\f\r\n]++\p{javaJavaIdentifierStart}\p{javaJavaIdentifierPart}*+)?+)(?:[ \t\f\r\n]*+,[ \t\f\r\n]*+(?:\p{javaJavaIdentifierStart}\p{javaJavaIdentifierPart}*+|[?][ \t\f\r\n]*+(?:(?:extends|super)[ \t\f\r\n]++\p{javaJavaIdentifierStart}\p{javaJavaIdentifierPart}*+)?+))*+[ \t\f\r\n]*+>)?+)*+(?:[ \t\f\r\n]*+,[ \t\f\r\n]*+\p{javaJavaIdentifierStart}\p{javaJavaIdentifierPart}*+(?:[ \t\f\r\n]*+[.][ \t\f\r\n]*+\p{javaJavaIdentifierStart}\p{javaJavaIdentifierPart}*+)*+(?:[ \t\f\r\n]*+<[ \t\f\r\n]*+(?:\p{javaJavaIdentifierStart}\p{javaJavaIdentifierPart}*+|[?][ \t\f\r\n]*+(?:(?:extends|super)[ \t\f\r\n]++\p{javaJavaIdentifierStart}\p{javaJavaIdentifierPart}*+)?+)(?:[ \t\f\r\n]*+,[ \t\f\r\n]*+(?:\p{javaJavaIdentifierStart}\p{javaJavaIdentifierPart}*+|[?][ \t\f\r\n]*+(?:(?:extends|super)[ \t\f\r\n]++\p{javaJavaIdentifierStart}\p{javaJavaIdentifierPart}*+)?+))*+[ \t\f\r\n]*+>)?+(?:[ \t\f\r\n]*+[.][ \t\f\r\n]*+\p{javaJavaIdentifierStart}\p{javaJavaIdentifierPart}*+(?:[ \t\f\r\n]*+<[ \t\f\r\n]*+(?:\p{javaJavaIdentifierStart}\p{javaJavaIdentifierPart}*+|[?][ \t\f\r\n]*+(?:(?:extends|super)[ \t\f\r\n]++\p{javaJavaIdentifierStart}\p{javaJavaIdentifierPart}*+)?+)(?:[ \t\f\r\n]*+,[ \t\f\r\n]*+(?:\p{javaJavaIdentifierStart}\p{javaJavaIdentifierPart}*+|[?][ \t\f\r\n]*+(?:(?:extends|super)[ \t\f\r\n]++\p{javaJavaIdentifierStart}\p{javaJavaIdentifierPart}*+)?+))*+[ \t\f\r\n]*+>)?+)*+)*+)[ \t\f\r\n]*+)?+[{]
It is generated by building up the pattern piece by piece. I reference Java Language Specification for Java SE 7 for the grammar of NormalClassDeclaration
and NormalInterfaceDeclaration.
Here is the code to generate the monster above, including comments on which pattern is skipped:
// All rules never end with WhiteSpace. This allows us to check for error easier.
final static String Identifier = "\\p{javaJavaIdentifierStart}\\p{javaJavaIdentifierPart}*+";
// Heuristic to exclude any position where adding space would split an Identifier or a keyword apart
final static String TokenBoundary = "(?:(?<!\\p{javaJavaIdentifierPart})|(?!\\p{javaJavaIdentifierPart}))";
// Optional whitespace
final static String WhiteSpaceOpt = "[ \\t\\f\\r\\n]*+";
// Compulsory whitespace
final static String WhiteSpaceCom = "[ \\t\\f\\r\\n]++";
final static String MarkerAnnotation = "@" + WhiteSpaceOpt + Identifier;
// Skipped: SingleElementAnnotation, NormalAnnotation
// Can't be included due to middle recursion
final static String Annotation = MarkerAnnotation;
final static String ClassModifier = "(?:public|protected|private|abstract|static|final|strictfp|" + Annotation + ")";
// Since the declaration
// public@Deprecated ...
// is allowed, the space between modifiers is optional.
// Since allowing the space to be optional recognizes this invalid declaration
// publicstatic ...
// we need to assert boundary between 2 class modifiers
final static String ClassModifiers = ClassModifier + "(?:" + WhiteSpaceOpt + TokenBoundary + ClassModifier + ")*+";
final static String TypeVariable = Identifier;
// Skipped: ArrayType, ClassOrInterfaceType
// Can't be included, since the grammar is context-free, and this is where middle recursion occurs
final static String ReferenceType = TypeVariable;
final static String WildcardBounds = "(?:extends|super)" + WhiteSpaceCom + ReferenceType;
final static String Wildcard = "[?]" + WhiteSpaceOpt + "(?:" + WildcardBounds + ")?+";
final static String TypeArgument = "(?:" + ReferenceType + "|" + Wildcard + ")";
final static String TypeArgumentList = TypeArgument + "(?:" + WhiteSpaceOpt + "," + WhiteSpaceOpt + TypeArgument + ")*+";
final static String TypeArguments = "<" + WhiteSpaceOpt + TypeArgumentList + WhiteSpaceOpt + ">";
final static String TypeName = Identifier + "(?:" + WhiteSpaceOpt + "[.]" + WhiteSpaceOpt + Identifier + ")*+";
// Expanded definition of ClassOrInterfaceType = ClassType = InterfaceType
// ClassOrInterfaceType -> TypeName TypeArguments<opt>
// ClassOrInterfaceType -> ClassOrInterfaceType . Identifier TypeArguments<opt>
final static String ClassType = TypeName + "(?:" + WhiteSpaceOpt + TypeArguments + ")?+" +
"(?:" + WhiteSpaceOpt + "[.]" + WhiteSpaceOpt + Identifier + "(?:" + WhiteSpaceOpt + TypeArguments + ")?+" + ")*+";
// Definition of ClassType and InterfaceType are identical
final static String InterfaceType = ClassType;
final static String ClassOrInterfaceType = ClassType;
final static String TypeBound = "extends" + WhiteSpaceCom + "(?:" + ClassOrInterfaceType + "|" + TypeVariable + ")";
final static String TypeParameter = TypeVariable + "(?:" + WhiteSpaceCom + TypeBound + ")?+";
final static String TypeParameterList = TypeParameter + "(?:" + WhiteSpaceOpt + "," + WhiteSpaceOpt + TypeParameter + ")*+";
final static String TypeParameters = "<" + WhiteSpaceOpt + TypeParameterList + WhiteSpaceOpt + ">";
final static String Super = "extends" + WhiteSpaceCom + ClassType;
final static String InterfaceTypeList = InterfaceType + "(?:" + WhiteSpaceOpt + "," + WhiteSpaceOpt + InterfaceType + ")*+";
final static String Interfaces = "implements" + WhiteSpaceCom + InterfaceTypeList;
final static String NormalClassDeclaration =
// Annotation in its fullest form can end in ), so WhiteSpaceOpt is used here for pedantic
// It can be changed to WhiteSpaceCom to save the TokenBoundary check, since current definition always end with Identifier character
"(?:" + "(" + ClassModifiers + ")" + WhiteSpaceOpt + ")?+" +
TokenBoundary + "class" + WhiteSpaceCom +
// WhiteSpaceOpt is used here, since TypeParameters starts with < and no whitespace is needed to delimit
"(" + Identifier + ")" + WhiteSpaceOpt +
"(?:" + "(" + TypeParameters + ")" + WhiteSpaceOpt + ")?+" +
// As the result, we need to check for boundary before "extends" and "implements"
TokenBoundary + "(?:" + "(" + Super + ")"+ WhiteSpaceOpt + ")?+" +
TokenBoundary + "(?:" + "(" + Interfaces + WhiteSpaceOpt + ")" + ")?+[{]";
// ClassBody is skipped, and only opening bracket { is matched
final static String InterfaceModifier = "(?:public|protected|private|abstract|static|strictfp|" + Annotation + ")";
// Same as ClassModifiers, except that "final" is no longer a valid modifier
final static String InterfaceModifiers = InterfaceModifier + "(?:" + WhiteSpaceOpt + TokenBoundary + InterfaceModifier + ")*+";
final static String ExtendsInterfaces = "extends" + WhiteSpaceCom + InterfaceTypeList;
final static String NormalInterfaceDeclaration =
"(?:" + "(" + InterfaceModifiers + ")" + WhiteSpaceOpt + ")?+" +
TokenBoundary + "interface" + WhiteSpaceCom +
// WhiteSpaceOpt is used here, since TypeParameters starts with < and no whitespace is needed to delimit
"(" + Identifier + ")" + WhiteSpaceOpt +
"(?:" + "(" + TypeParameters + ")" + WhiteSpaceOpt + ")?+" +
// As the result, we need to check for boundary before "extends" here
TokenBoundary + "(?:" + "(" + ExtendsInterfaces + ")" + WhiteSpaceOpt + ")?+[{]";
I am well aware CamelCase and also the fact that some constants differ by pluralization are inappropriate, but it provides a better mapping to the grammar defined in the JLS.
Here is the demo on ideone.