Is there a way to find all children of a Matlab class?

孤城傲影 2021-02-19 23:04

The Matlab function superclasses returns the names of all parents of a given class.

Is there an equivalent to find all classes derived from a given class, i

3 Answers
  • 2021-02-19 23:13

    Intro:

    During the course of the solution I seem to have found an undocumented static method of the meta.class class which returns all cached classes (pretty much everything that gets erased when somebody calls clear classes) and also (entirely by accident) made a tool that checks classdef files for errors.


    Since we want to find all subclasses, the sure way to go is by making a list of all known classes and then checking for each one if it's derived from any other one. To achieve this we separate our effort into 2 types of classes:

    • "Bulk classes" - here we employ the what function to make a list of files that are just "lying around" on the MATLAB path. It outputs a structure s (described in the docs of what) with the following fields: 'path' 'm' 'mlapp' 'mat' 'mex' 'mdl' 'slx' 'p' 'classes' 'packages'. We then select some of these fields to build a list of classes. To identify what kind of contents an .m or a .p file has (what we care about is class/not-class), we use exist. This method is demonstrated by Loren in her blog. In my code, this is mb_list.
    • "Package classes" - this includes class files that are indexed by MATLAB as part of its internal package structure. The algorithm involved in getting this list involves calling meta.package.getAllPackages and then recursively traversing this top-level package list to get all sub-packages. Then a class list is extracted from each package, and all lists are concatenated into one long list - mp_list.

    The function has two input flags (includeBulkFiles, includePackages) that determine whether each type of class should be included in the output list.

    The full code is below:

    function [mc_list,subcls_list] = q37829489(includeBulkFiles,includePackages)
    %% Input handling
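    % Default package class list; defined up front so that it also exists
    % when includePackages is passed explicitly as false:
    mp_list = meta.package.empty;
    if nargin < 2 || isempty(includePackages)
      includePackages = false;
    end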
    if nargin < 1 || isempty(includeBulkFiles)
      includeBulkFiles = false;
      mb_list = meta.class.empty; %#ok
      % `mb_list` is always overwritten by the output of meta.class.getAllClasses; 
    end
    %% Output checking
    if nargout < 2
      warning('Second output not assigned!');
    end
    %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
    %% Get classes list from bulk files "lying around" the MATLAB path:
    if includeBulkFiles
      % Obtain MATLAB path:
      p = strsplit(path,pathsep).';
      if ~ismember(pwd,p)
        p = [pwd;p];
      end
      nPaths = numel(p);
      s = what; s = repmat(s,nPaths+20,1); % Preallocation; +20 is to accommodate rare cases 
      s_pos = 1;                           %  where "what" returns a 2x1 struct.
      for ind1 = 1:nPaths  
        tmp = what(p{ind1});
        s(s_pos:s_pos+numel(tmp)-1) = tmp;
        s_pos = s_pos + numel(tmp);
      end
      s(s_pos:end) = []; % truncation of placeholder entries.
      clear p nPaths s_pos tmp
      %% Generate a list of classes:
      % from .m files:
      m_files = vertcat(s.m);
      % from .p files:
      p_files = vertcat(s.p);
      % get a list of potential class names:
      [~,name,~] = cellfun(@fileparts,[m_files;p_files],'uni',false);
      % get listed classes:
      listed_classes = vertcat(s.classes); % s is a struct array, so collect every entry
      % combine all potential class lists into one:
      cls_list = vertcat(name,listed_classes);
      % test which ones are actually classes:
      isClass = cellfun(@(x)exist(x,'class')==8,cls_list); %"exist" method; takes long
      %[u,ia,ic] = unique(ext(isClass(1:numel(ext)))); %DEBUG:
    
      % for valid classes, get metaclasses from name; if a classdef contains errors,
      % will cause cellfun to print the reason using ErrorHandler.
      [~] = cellfun(@meta.class.fromName,cls_list(isClass),'uni',false,'ErrorHandler',...
         @(ex,in)meta.class.empty(0*fprintf(1,'The classdef for "%s" contains an error: %s\n'...
                                             , in, ex.message)));
      % The result of the last computation used to be assigned into mc_list, but this 
      % is no longer required as the same information (and more) is returned later
      % by calling "mb_list = meta.class.getAllClasses" since these classes are now cached.
      clear cls_list isClass ind1 listed_classes m_files p_files name s
    end
    %% Get class list from classes belonging to packages (takes long!):
    
    if includePackages
      % Get a list of all package classes:
      mp_list = meta.package.getAllPackages; mp_list = vertcat(mp_list{:});  
      % see http://www.mathworks.com/help/matlab/ref/meta.package.getallpackages.html
    
      % Recursively flatten package list:
      mp_list = flatten_package_list(mp_list);
    
      % Extract classes out of packages:
      mp_list = vertcat(mp_list.ClassList);
    end
    %% Combine lists:
    % Get a list of all classes that are in memory:
    mb_list = meta.class.getAllClasses; 
    mc_list = union(vertcat(mb_list{:}), mp_list);
    %% Map relations:
    try
      [subcls_list,discovered_classes] = find_superclass_relations(mc_list);
      while ~isempty(discovered_classes)
        mc_list = union(mc_list, discovered_classes);
        [subcls_list,discovered_classes] = find_superclass_relations(mc_list);
      end
    catch ex % Turns out this helps....
      disp(['Getting classes failed with error: ' ex.message ' Retrying...']);
      [mc_list,subcls_list] = q37829489;
    end
    
    end
    
    function [subcls_list,discovered_classes] = find_superclass_relations(known_metaclasses)
    %% Build hierarchy:
    sup_list = {known_metaclasses.SuperclassList}.';
    % Count how many superclasses each class has:
    n_supers = cellfun(@numel,sup_list);
    % Preallocate a Subclasses container: 
    subcls_list = cell(numel(known_metaclasses),1); % should be meta.MetaData
    % Iterate over all classes and record each one under its direct superclasses:
    % discovered_classes = meta.class.empty(1,0); % right type, but causes segfault
    discovered_classes = meta.class.empty;
    for depth = max(n_supers):-1:1
      % The function of this top-most loop was initially to build a hierarchy starting 
      % from the deepest leaves, but due to lack of ideas on "how to take it from here",
      % it only serves to save some processing by skipping classes with "no parents".
      tmp = known_metaclasses(n_supers == depth);
      for ind1 = 1:numel(tmp)
        % Fortunately, SuperclassList only shows *DIRECT* superclasses. So we
        % only need to find the superclasses in the known classes list and add
        % the current class to that list.
        curr_cls = tmp(ind1);
        % It's a shame bsxfun only works for numeric arrays, or else we would employ: 
        % bsxfun(@eq,mc_list,tmp(ind1).SuperclassList.');
        for ind2 = 1:numel(curr_cls.SuperclassList)
          pos = find(curr_cls.SuperclassList(ind2) == known_metaclasses,1);
          % Did we find the superclass in the known classes list?
          if isempty(pos)
            discovered_classes(end+1,1) = curr_cls.SuperclassList(ind2); %#ok<AGROW>
      %       disp([curr_cls.SuperclassList(ind2).Name ' is not a previously known class.']);
            continue
          end      
          subcls_list{pos} = [subcls_list{pos} curr_cls];
        end    
      end  
    end
    end
    
    % The full flattened list for MATLAB R2016a contains about 20k classes.
    function flattened_list = flatten_package_list(top_level_list)
      flattened_list = top_level_list;
      for ind1 = 1:numel(top_level_list)
        flattened_list = [flattened_list;flatten_package_list(top_level_list(ind1).PackageList)];
      end
    end
    

    The outputs of this function are 2 vectors, which in Java terms can together be thought of as a Map<meta.class, List<meta.class>>:

    • mc_list - an object vector of class meta.class, where each entry contains information about one specific class known to MATLAB. These are the "keys" of our Map.
    • subcls_list - A (rather sparse) vector of cells, containing known direct subclasses of the classes appearing in the corresponding position of mc_list. These are the "values" of our Map, which are essentially List<meta.class>.

    Once we have these two lists, it's only a matter of finding the position of your class-of-interest in mc_list and getting the list of its subclasses from subcls_list. If indirect subclasses are required, the same process is repeated for the subclasses too.
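
The lookup described above can be sketched as a small worklist loop. This is only an illustration, not part of the original function: to keep it self-contained, mc_list and subcls_list are toy stand-ins built by hand from three built-in classes (an assumption), and the names todo, found and allSubs are hypothetical.

```matlab
% Toy stand-ins for the two outputs (assumption: three built-in classes whose
% relations are filled in by hand instead of being computed by q37829489).
mc_list = [meta.class.fromName('handle'); ...
           meta.class.fromName('dynamicprops'); ...
           meta.class.fromName('matlab.mixin.Copyable')];
subcls_list = {[mc_list(2) mc_list(3)]; []; []};   % only handle has subclasses here

root  = meta.class.fromName('handle');   % class of interest
todo  = find(mc_list == root, 1);        % positions in mc_list still to expand
found = false(numel(mc_list), 1);        % marks every subclass reached so far
while ~isempty(todo)
  kids = subcls_list{todo(1)};           % direct subclasses of the current class
  todo(1) = [];
  for k = 1:numel(kids)
    pos = find(mc_list == kids(k), 1);
    if ~found(pos)                       % a class may be reachable via several parents
      found(pos) = true;
      todo(end+1) = pos;                 %#ok<AGROW>
    end
  end
end
allSubs = mc_list(found);                % direct and indirect subclasses of root
disp({allSubs.Name});
```

With the real outputs of the function, only the first two assignments would be dropped and root changed to the class of interest.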

    Alternatively, one can represent the hierarchy using e.g. a logical sparse adjacency matrix, A, where A(i,j)==1 signifies that class i is a subclass of j. The transpose A.' then signifies the opposite relation: its (i,j) entry being 1 means i is a superclass of j. Keeping these properties of the adjacency matrix in mind allows very rapid searches and traversals of the hierarchy (avoiding the need for "expensive" comparisons of meta.class objects).
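
A minimal sketch of the adjacency-matrix idea, assuming the classes have already been mapped to integer indices. The toy data is the sde hierarchy as listed in Oleg's answer on this page; the variable names (sub, sup, frontier, found) are hypothetical and not taken from the function above.

```matlab
% Toy hierarchy: class names and, for each non-root class, its direct superclass.
names = {'sde','bm','sdeld','cev','gbm','sdeddo','heston','cir','sdemrd','hwv'};
sub = [2 3 4 5 6 7 8 9 10];          % index of each subclass
sup = [3 6 3 4 1 6 9 6 9];           % index of its direct superclass
n   = numel(names);
A   = sparse(sub, sup, true, n, n);  % A(i,j)==1: class i is a direct subclass of class j

% Direct subclasses of a class are the nonzero rows of its column:
j      = find(strcmp(names, 'sdeddo'));
direct = names(find(A(:,j)));

% Direct and indirect subclasses: expand the frontier until nothing new appears.
found    = false(n,1);
frontier = false(n,1);
frontier(j) = true;
while any(frontier)
  next     = full(any(A(:,frontier), 2)) & ~found;  % subclasses of the current frontier
  found    = found | next;
  frontier = next;
end
all_subs = names(found);
disp(direct);
disp(all_subs);
```

The traversal only touches logical indices, so no meta.class objects are compared inside the loop.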

    Several notes:

    • For reasons unknown (caching?) the code may fail with an error (e.g. Invalid or deleted object.); in that case, re-running it helps. I have added a try/catch that does this automatically.
    • There are 2 instances in the code where arrays are grown inside a loop. This is of course unwanted and should be avoided. The code was left like that due to a lack of better ideas.
    • If the "discovery" part of the algorithm cannot be avoided (by somehow finding all the classes in the first place), one can (and should) optimize it so that every iteration only operates on previously unknown classes.
    • An interesting unintended benefit of running this code is that it scans all known classdefs and reports any errors in them - this can be a useful tool to run every once in a while for anyone who works on MATLAB OOP :)
    • Thanks @Suever for some helpful pointers.

    Comparison with Oleg's method:

    To compare these results with Oleg's example, I will use the output of a run of the above script on my computer (containing ~20k classes; uploaded here as a .mat file). We can then access the class map the following way:

    hRoot = meta.class.fromName('sde');
    subcls_list{mc_list==hRoot}
    
    ans = 
    
      class with properties:
    
                         Name: 'sdeddo'
                  Description: ''
          DetailedDescription: ''
                       Hidden: 0
                       Sealed: 0
                     Abstract: 0
                  Enumeration: 0
              ConstructOnLoad: 0
             HandleCompatible: 0
              InferiorClasses: {0x1 cell}
            ContainingPackage: [0x0 meta.package]
                 PropertyList: [9x1 meta.property]
                   MethodList: [18x1 meta.method]
                    EventList: [0x1 meta.event]
        EnumerationMemberList: [0x1 meta.EnumeratedValue]
               SuperclassList: [1x1 meta.class]
    
    subcls_list{mc_list==subcls_list{mc_list==hRoot}} % simulate recursion
    
    ans = 
    
      class with properties:
    
                         Name: 'sdeld'
                  Description: ''
          DetailedDescription: ''
                       Hidden: 0
                       Sealed: 0
                     Abstract: 0
                  Enumeration: 0
              ConstructOnLoad: 0
             HandleCompatible: 0
              InferiorClasses: {0x1 cell}
            ContainingPackage: [0x0 meta.package]
                 PropertyList: [9x1 meta.property]
                   MethodList: [18x1 meta.method]
                    EventList: [0x1 meta.event]
        EnumerationMemberList: [0x1 meta.EnumeratedValue]
               SuperclassList: [1x1 meta.class]
    

    Here we can see that the last output is only 1 class (sdeld), when we were expecting 3 of them (sdeld, sdemrd, heston) - this means that some classes are missing from this list (see note 1 below).

    In contrast, if we check a common parent class such as handle, we see a completely different picture:

    subcls_list{mc_list==meta.class.fromName('handle')}
    
    ans = 
    
      1x4059 heterogeneous class (NETInterfaceCustomMetaClass, MetaClassWithPropertyType, MetaClass, ...) array with properties:
    
        Name
        Description
        DetailedDescription
        Hidden
        Sealed
        Abstract
        Enumeration
        ConstructOnLoad
        HandleCompatible
        InferiorClasses
        ContainingPackage
        PropertyList
        MethodList
        EventList
        EnumerationMemberList
        SuperclassList
    

    To summarize: this method attempts to index all known classes on the MATLAB path. Building the class list/index takes several minutes, but this is a one-time process that pays off later when the list is searched. It seems to miss some classes, but the found relations are not restricted to the same packages, paths etc. For this reason it inherently supports multiple inheritance.


    1 - I currently have no idea what causes this.

  • 2021-02-19 23:32

    The code

    Since the code was over 200 lines long, I moved it to the GitHub repository getSubclasses. You are welcome to request features and post bug reports.

    Idea

    Given a root class name or a meta.class and a folder path, it will traverse down the folder structure and build a graph with all the subclasses derived from the root (at infinite depth). If the path is not supplied, then it will recurse down from the folder where the root class is located.

    Note that the solution is local, which is why it is fast: it relies on the assumption that subclasses are nested under some subfolder of the chosen path.

    Example

    List all subclasses of the sde class. You will need R2015b or later to be able to produce the graph, or you can use the output and the FEX submission plot_graph() to produce a dependency graph.

    getSubclasses('sde','C:\Program Files\MATLAB\R2016a\toolbox\finance\finsupport')

    And the output with the edges and node names:

     names      from    to
    ________    ____    __
    'sde'        1      1 
    'bm'         2      3 
    'sdeld'      3      6 
    'cev'        4      3 
    'gbm'        5      4 
    'sdeddo'     6      1 
    'heston'     7      6 
    'cir'        8      9 
    'sdemrd'     9      6 
    'hwv'       10      9 
    

    We can compare the result with the official documentation, which in this case lists the SDE hierarchy.

    Timing

    On Windows 7 64-bit, R2016a:

    • less than 0.1 seconds: getSubclasses('sde','C:\Program Files\MATLAB\R2016a\toolbox\finance\finsupport')
    • about 13 seconds if scanning the whole matlabroot: getSubclasses('sde',matlabroot);
  • 2021-02-19 23:32

    Not a complete solution, but you could parse all .m files on the path as text and use regular expressions to look for the subclass definitions.

    Something like ^\s*classdef\s*(\w*)\s*<\s*superClassName\s*(%.*)?

    Note that this will fail silently on any subclass definitions that use anything fancy, such as eval.
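
A minimal sketch of how such a pattern could be tried out with regexp. It assumes the file contents are already available as text (here the made-up sample src) and uses a slightly extended pattern that tolerates an attribute list and several superclasses joined by &. The names src, pat and superName are hypothetical, and line continuations (...) in the classdef line are not handled.

```matlab
superName = 'sde';   % superclass of interest (example value)

% Made-up file contents; in practice this would come from e.g. fileread.
src = sprintf('%s\n', ...
  'classdef (Sealed) sdeddo < sde & handle  % drift and diffusion', ...
  '  properties', ...
  '  end', ...
  'end');

% classdef, optional (attribute list), class name, '<', then the superclass
% list up to a comment or the end of the line.
pat = '^\s*classdef\s*(?:\([^)]*\)\s*)?(\w+)\s*<([^%\n]*)';
tok = regexp(src, pat, 'tokens', 'once', 'lineanchors');

isSub = false;
if ~isempty(tok)
  supers = strtrim(strsplit(tok{2}, '&'));   % e.g. {'sde','handle'}
  isSub  = any(strcmp(supers, superName));   % is superName a direct superclass?
end
if isSub
  fprintf('%s derives directly from %s\n', tok{1}, superName);
end
```

This only detects direct subclasses named literally in the classdef line; indirect ones would need a second pass over the names found.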
