Question
I'm having trouble writing an algorithm that scans a file system and finds all subclasses of a certain class.
Details:
I have an app that scans an external application using NIO's Files.walk(). While reading each file, I check for "extends SuperClass"; if the phrase exists, I add the class name to my list as follows:
List<String> subclasses = new ArrayList<>();
Files.walk(appPath)
    .filter(p->Files.isRegularFile(p) && p.toString().endsWith(".java"))
    .forEach(path -> {
        try {
            List<String> lines = Files.readAllLines(path);
            Pattern pattern = Pattern.compile("\\bextends SuperClass\\b");
            // join the lines so a declaration split across lines is still matched
            Matcher matcher = pattern
                .matcher(lines.stream()
                    .collect(Collectors.joining(" ")));
            boolean isChild = matcher.find();
            if(isChild) subclasses.add(path.getFileName().toString());
        }catch (IOException e){
            //handle IOE
        }
    });
The problem with the above is that it only finds direct subclasses of SuperClass, but I need to retrieve all direct and indirect subclasses.
I thought about recursion, since I have no idea how many subclasses of SuperClass there are, but I couldn't come up with a reasonable implementation.
NOTES:
- I'm scanning more than 600 thousand files
- I have no idea how many direct/indirect subclasses of SuperClass there are
- The application that I'm scanning is external and I can't modify its code, so I'm only allowed to access it by reading files and seeing where extends exists
- If there is a non-recursive solution to the problem, that would be great, but if there's no other way, I'll be more than happy to accept a recursive one, since I care about the solution more than performance.
Edit:
I use the following regex to compare both the name and the import, to make sure the output is correct even when classes in different packages share the same name:
Pattern pattern = Pattern.compile("("+superClass.getPackage()+")[\\s\\S]*(\\bextends "+superClass.getName()+"\\b)[\\s\\S]");
I also tried:
Pattern pattern = Pattern.compile("\\bextends "+superClass.getName()+"\\b");
But some subclasses are still missing. I believe the code below skips some checks and doesn't fully work:
public static List<SuperClass> getAllSubClasses(Path path, SuperClass parentClass) throws IOException{
    classesToDo.add(baseClass);
    while(classesToDo.size() > 0) {
        SuperClass superClass = classesToDo.remove(0);
        List<SuperClass> subclasses = getDirectSubClasses(parentPath,parentClass);
        if(subclasses.size() > 0)
            classes.addAll(subclasses);
        classesToDo.addAll(subclasses);
    }
    return classes;
}
Any help is truly appreciated!
Edit 2
I also noticed another problem: when I detect a subclass, I take the file name from currentPath.getFileName(), which might or might not be the subclass name, as the subclass may be a nested or non-public class in the same file.
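One way to avoid relying on the file name is to capture the declared class name in the same match that finds the extends clause. The following is only a minimal sketch, assuming a plain regex is good enough for the scanned code base. DeclaredSubclassFinder and findDeclarations are hypothetical names, and the pattern ignores comments, string literals, and type parameter lists that contain nested angle brackets.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of the original post.
final class DeclaredSubclassFinder {

    // Matches "class Name extends Parent", allowing an optional type
    // parameter list after Name and a qualified Parent such as pkg.Parent.
    // Assumption: the type parameter list contains no nested '>' characters.
    private static final Pattern DECLARATION = Pattern.compile(
        "\\bclass\\s+(\\w+)\\s*(?:<[^>]*>)?\\s*extends\\s+([\\w.]+)");

    // Returns declared class name -> superclass name as written in the source.
    static Map<String, String> findDeclarations(String source) {
        Map<String, String> result = new LinkedHashMap<>();
        Matcher m = DECLARATION.matcher(source);
        while (m.find()) {
            result.put(m.group(1), m.group(2));
        }
        return result;
    }

    public static void main(String[] args) {
        String source =
            "public class Outer extends SuperClass {\n"
          + "    static class Inner extends Outer {}\n"
          + "}\n"
          + "class Helper<T> extends some.pkg.Base {}\n";
        System.out.println(findDeclarations(source));
    }
}
```

For the sample source in main, this prints {Outer=SuperClass, Inner=Outer, Helper=some.pkg.Base}, so nested and non-public classes are reported under their own names rather than under the file name.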
Answer 1:
I strongly recommend parsing compiled class files instead of source code. Since class files are already optimized for being processed by machines, much of the complexity and many of the corner cases of source file processing are eliminated.
So a solution to build a complete class hierarchy tree using the ASM library would look like this:
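// Imports needed by the snippets in this answer (ASM must be on the class path)
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.URI;
import java.net.URISyntaxException;
import java.nio.file.*;
import java.util.*;
import java.util.stream.*;
import org.objectweb.asm.*;

public static Map<String, Set<String>> getClassHierarchy(Path root) throws IOException {
    // Files.walk keeps directories open, so close the stream when done
    try(Stream<Path> files = Files.walk(root)) {
        return files
            .filter(p->Files.isRegularFile(p) && isClass(p.getFileName().toString()))
            .map(p -> getClassAndSuper(p))
            // group by superclass name: parent -> set of direct subclasses
            .collect(Collectors.groupingBy(Map.Entry::getValue,
                    Collectors.mapping(Map.Entry::getKey, Collectors.toSet())));
    }
}
private static boolean isClass(String fName) {
    // skip package-info and module-info
    return fName.endsWith(".class") && !fName.endsWith("-info.class");
}
private static Map.Entry<String,String> getClassAndSuper(Path p) {
    final class CV extends ClassVisitor {
        Map.Entry<String,String> result;
        public CV() {
            // ASM5 suffices for Java 8 class files; class files from newer
            // JDKs may need a higher API level, e.g. Opcodes.ASM9
            super(Opcodes.ASM5);
        }
        @Override
        public void visit(int version, int access,
                String name, String signature, String superName, String[] interfaces) {
            // superName is null only for java.lang.Object
            result = new AbstractMap.SimpleImmutableEntry<>(
                Type.getObjectType(name).getClassName(),
                superName!=null? Type.getObjectType(superName).getClassName(): "");
        }
    }
    try {
        final CV visitor = new CV();
        new ClassReader(Files.readAllBytes(p)).accept(visitor, ClassReader.SKIP_CODE);
        return visitor.result;
    } catch (IOException ex) {
        throw new UncheckedIOException(ex);
    }
}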
As a bonus, and as a way to create some test cases, the following method adds the ability to build the hierarchy for the location a runtime class was loaded from:
public static Map<String, Set<String>> getClassHierarchy(Class<?> context)
        throws IOException, URISyntaxException {
    Path p;
    URI clURI = context.getResource(context.getSimpleName()+".class").toURI();
    if(clURI.getScheme().equals("jrt")) p = Paths.get(URI.create("jrt:/modules"));
    else {
        if(!clURI.getScheme().equals("file")) try {
            FileSystems.getFileSystem(clURI);
        } catch(FileSystemNotFoundException ex) {
            FileSystems.newFileSystem(clURI, Collections.emptyMap());
        }
        String qn = context.getName();
        p = Paths.get(clURI).getParent();
        for(int ix = qn.indexOf('.'); ix>0; ix = qn.indexOf('.', ix+1)) p = p.getParent();
    }
    return getClassHierarchy(p);
}
Then, you can do
Map<String, Set<String>> hierarchy = getClassHierarchy(Number.class);
System.out.println("Direct subclasses of "+Number.class);
hierarchy.getOrDefault("java.lang.Number", Collections.emptySet())
.forEach(System.out::println);
and get
Direct subclasses of class java.lang.Number
java.lang.Float
java.math.BigDecimal
java.util.concurrent.atomic.AtomicLong
java.lang.Double
java.lang.Long
java.util.concurrent.atomic.AtomicInteger
java.lang.Short
java.math.BigInteger
java.lang.Byte
java.util.concurrent.atomic.Striped64
java.lang.Integer
or
Map<String, Set<String>> hierarchy = getClassHierarchy(Number.class);
System.out.println("All subclasses of "+Number.class);
printAllClasses(hierarchy, "java.lang.Number", " ");
private static void printAllClasses(
        Map<String, Set<String>> hierarchy, String parent, String i) {
    hierarchy.getOrDefault(parent, Collections.emptySet())
        .forEach(x -> {
            System.out.println(i+x);
            printAllClasses(hierarchy, x, i+" ");
        });
}
to get
All subclasses of class java.lang.Number
 java.lang.Float
 java.math.BigDecimal
 java.util.concurrent.atomic.AtomicLong
 java.lang.Double
 java.lang.Long
 java.util.concurrent.atomic.AtomicInteger
 java.lang.Short
 java.math.BigInteger
 java.lang.Byte
 java.util.concurrent.atomic.Striped64
  java.util.concurrent.atomic.LongAdder
  java.util.concurrent.atomic.LongAccumulator
  java.util.concurrent.atomic.DoubleAdder
  java.util.concurrent.atomic.DoubleAccumulator
 java.lang.Integer
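The question also asks for a non-recursive option. As a rough sketch, a map with the parent-to-children shape returned by getClassHierarchy could be flattened into the set of all direct and indirect subclasses with an explicit work queue. SubclassCollector and allSubclasses are hypothetical names, and the small map in main is made-up sample data rather than real scan output.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Collections;
import java.util.Deque;
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical helper, not part of the original answer.
final class SubclassCollector {

    // Breadth-first traversal of a parent -> direct subclasses map.
    static Set<String> allSubclasses(Map<String, Set<String>> hierarchy, String root) {
        Set<String> found = new LinkedHashSet<>();
        Deque<String> toDo = new ArrayDeque<>();
        toDo.add(root);
        while (!toDo.isEmpty()) {
            String parent = toDo.remove();
            for (String child : hierarchy.getOrDefault(parent, Collections.emptySet())) {
                // add() returns false for names already seen, which also
                // guards against malformed input containing a cycle
                if (found.add(child)) {
                    toDo.add(child);
                }
            }
        }
        return found;
    }

    public static void main(String[] args) {
        // Sample data only; a real map would come from scanning class files.
        Map<String, Set<String>> hierarchy = new HashMap<>();
        hierarchy.put("java.lang.Number", new LinkedHashSet<>(Arrays.asList(
            "java.lang.Long", "java.util.concurrent.atomic.Striped64")));
        hierarchy.put("java.util.concurrent.atomic.Striped64", new LinkedHashSet<>(
            Arrays.asList("java.util.concurrent.atomic.LongAdder")));
        System.out.println(allSubclasses(hierarchy, "java.lang.Number"));
    }
}
```

With the sample data this prints [java.lang.Long, java.util.concurrent.atomic.Striped64, java.util.concurrent.atomic.LongAdder]. Because the hierarchy map is built in a single pass over the files, the traversal itself never touches the file system again.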
Answer 2:
DISCLAIMER: This solution might not work if you have several classes with the same name, as it does not take package names into account.
I think you can do it by keeping track of the classes to look up in a List and using a while loop until all the values in the list have been explored.
Here is a bit of code which creates a Map<String, List<String>>, where the key is the class name and the value is the list of child classes.
public class Test {
    private static Path appPath = //your path
    private static Map<String, List<String>> classes = new HashMap<>();
    private static List<String> classesToDo = new ArrayList<>();

    public static void main(String[] args) throws IOException {
        classesToDo.add("AnswerValueValidatorBase");
        while(classesToDo.size() > 0) {
            String className = classesToDo.remove(0);
            List<String> subclasses = getDirectSubclasses(className);
            if(subclasses.size() > 0)
                classes.put(className, subclasses);
            classesToDo.addAll(subclasses);
        }
        System.out.println(classes);
    }

    private static List<String> getDirectSubclasses(String className) throws IOException {
        List<String> subclasses = new ArrayList<>();
        Files.walk(appPath)
            .filter(p -> Files.isRegularFile(p) && p.toString().endsWith(".java"))
            .forEach(path -> {
                try {
                    List<String> lines = Files.readAllLines(path);
                    Pattern pattern = Pattern.compile("\\bextends "+className+"\\b");
                    Matcher matcher = pattern.matcher(lines.stream().collect(Collectors.joining(" ")));
                    boolean isChild = matcher.find();
                    if(isChild) {
                        String fileName = path.getFileName().toString();
                        String clazzName = fileName.substring(0, fileName.lastIndexOf("."));
                        subclasses.add(clazzName);
                    }
                } catch(IOException e) {
                    //handle IOE
                }
            });
        return subclasses;
    }
}
Running it on my project returns something that looks correct:
{
AnswerValueValidatorBase=[SingleNumericValidator, DefaultValidator, RatingValidator, ArrayValidatorBase, DocumentValidator],
ArrayValidatorBase=[MultiNumericValidator, StringArrayValidator, IntegerArrayValidator, MultiCheckboxValidator],
DefaultValidator=[IntegerValidator, DateValidator, StringValidator, CountryValidator, PercentageValidator],
IntegerArrayValidator=[MultiPercentageValidator, RankValidator, MultiDropValidator, MultiRadioValidator, CheckboxValidator],
SingleNumericValidator=[SliderValidator],
MultiNumericValidator=[MultiSliderValidator],
StringArrayValidator=[MultiTextValidator, ChecklistValidator]
}
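Note that getDirectSubclasses walks the whole tree once for every class it looks up, which may become slow with a large number of files. A possible variation, sketched here under the same simplifying assumptions as the code above (the file name is taken as the class name, and package names are ignored), reads every file once and records each extends clause it sees. SinglePassIndex and indexByParent are hypothetical names.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical helper, not part of the original answer.
final class SinglePassIndex {

    // Captures the simple name after "extends"; a qualified name such as
    // pkg.Base is reduced to Base. Limitation: type bounds like
    // <T extends Foo> are matched as well.
    private static final Pattern EXTENDS =
        Pattern.compile("\\bextends\\s+(?:\\w+\\.)*(\\w+)");

    // Returns parent simple name -> names of files that extend it.
    static Map<String, Set<String>> indexByParent(Path root) throws IOException {
        List<Path> sources;
        try (Stream<Path> files = Files.walk(root)) {
            sources = files
                .filter(p -> Files.isRegularFile(p) && p.toString().endsWith(".java"))
                .collect(Collectors.toList());
        }
        Map<String, Set<String>> children = new HashMap<>();
        for (Path path : sources) {
            String fileName = path.getFileName().toString();
            String className = fileName.substring(0, fileName.lastIndexOf('.'));
            // Assumption: sources are UTF-8; malformed bytes are replaced, not rejected
            String text = new String(Files.readAllBytes(path), StandardCharsets.UTF_8);
            Matcher m = EXTENDS.matcher(text);
            while (m.find()) {
                children.computeIfAbsent(m.group(1), k -> new TreeSet<>()).add(className);
            }
        }
        return children;
    }

    public static void main(String[] args) throws IOException {
        // Small demo tree in a temporary directory
        Path dir = Files.createTempDirectory("single-pass-demo");
        Files.write(dir.resolve("Base.java"), "class Base {}".getBytes(StandardCharsets.UTF_8));
        Files.write(dir.resolve("Mid.java"), "class Mid extends Base {}".getBytes(StandardCharsets.UTF_8));
        Files.write(dir.resolve("Leaf.java"), "class Leaf extends pkg.Mid {}".getBytes(StandardCharsets.UTF_8));
        System.out.println(indexByParent(dir));
    }
}
```

The while loop in main above could then stay as it is, with each getDirectSubclasses(className) call replaced by a lookup in the returned map, so the file system is scanned only once.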
EDIT
A recursive way of doing it would be:
public class Test {
    private static Path appPath = // your path

    public static void main(String[] args) throws IOException {
        List<String> classesToDo = new ArrayList<>();
        classesToDo.add("AnswerValueValidatorBase");
        Map<String, List<String>> classesMap = getSubclasses(new HashMap<>(), classesToDo);
        System.out.println(classesMap);
    }

    private static Map<String, List<String>> getSubclasses(Map<String, List<String>> classesMap, List<String> classesToDo) throws IOException {
        if(classesToDo.size() == 0) {
            return classesMap;
        } else {
            String className = classesToDo.remove(0);
            List<String> subclasses = getDirectSubclasses(className);
            if(subclasses.size() > 0)
                classesMap.put(className, subclasses);
            classesToDo.addAll(subclasses);
            return getSubclasses(classesMap, classesToDo);
        }
    }

    private static List<String> getDirectSubclasses(String className) throws IOException {
        // same as above
    }
}
Source: https://stackoverflow.com/questions/49896798/find-direct-and-indirect-subclasses-by-scanning-filesystem