The Matlab function superclasses returns the names of all parents of a given class.
Is there an equivalent to find all classes derived from a given class, i
Intro:
During the course of the solution I seem to have found an undocumented static method of the meta.class class which returns all cached classes (pretty much everything that gets erased when somebody calls clear classes) and also (entirely by accident) made a tool that checks classdef files for errors.
Since we want to find all subclasses, the sure way to go is by making a list of all known classes and then checking for each one if it's derived from any other one. To achieve this we separate our effort into 2 types of classes:
Classes in bulk files "laying around" the MATLAB path: calling what on each path folder returns a struct s (described in the docs of what) having the following fields: 'path' 'm' 'mlapp' 'mat' 'mex' 'mdl' 'slx' 'p' 'classes' 'packages'. We will then select some of them to build a list of classes. To identify what kind of contents an .m or a .p file has (what we care about is class/not-class), we use exist. This method is demonstrated by Loren in her blog. In my code, this is mb_list.
Classes belonging to packages: these are collected by calling meta.package.getAllPackages, recursively flattening the package list, and extracting the ClassList of each package. In my code, this is mp_list.
The script has two input flags (includeBulkFiles, includePackages) that determine whether each type of classes should be included in the output list.
The full code is below:
function [mc_list,subcls_list] = q37829489(includeBulkFiles,includePackages)
%% Input handling
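if nargin < 2 || isempty(includePackages)
includePackages = false;
end
% Always initialize mp_list: it is used when combining the lists below, even
% when includePackages is explicitly passed as false.
mp_list = meta.package.empty;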
if nargin < 1 || isempty(includeBulkFiles)
includeBulkFiles = false;
mb_list = meta.class.empty; %#ok
% `mb_list` is always overwritten by the output of meta.class.getAllClasses;
end
%% Output checking
if nargout < 2
warning('Second output not assigned!');
end
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%% Get classes list from bulk files "laying around" the MATLAB path:
if includeBulkFiles
% Obtain MATLAB path:
p = strsplit(path,pathsep).';
if ~ismember(pwd,p)
p = [pwd;p];
end
nPaths = numel(p);
s = what; s = repmat(s,nPaths+20,1); % Preallocation; +20 is to accommodate rare cases
s_pos = 1; % where "what" returns a 2x1 struct.
for ind1 = 1:nPaths
tmp = what(p{ind1});
s(s_pos:s_pos+numel(tmp)-1) = tmp;
s_pos = s_pos + numel(tmp);
end
s(s_pos:end) = []; % truncation of placeholder entries.
clear p nPaths s_pos tmp
%% Generate a list of classes:
% from .m files:
m_files = vertcat(s.m);
% from .p files:
p_files = vertcat(s.p);
% get a list of potential class names:
[~,name,~] = cellfun(@fileparts,[m_files;p_files],'uni',false);
% get listed classes:
listed_classes = vertcat(s.classes); % s is a struct array, so concatenate all entries
% combine all potential class lists into one:
cls_list = vertcat(name,listed_classes);
% test which ones are actually classes:
isClass = cellfun(@(x)exist(x,'class')==8,cls_list); %"exist" method; takes long
%[u,ia,ic] = unique(ext(isClass(1:numel(ext)))); %DEBUG:
% for valid classes, get metaclasses from name; if a classdef contains errors,
% will cause cellfun to print the reason using ErrorHandler.
[~] = cellfun(@meta.class.fromName,cls_list(isClass),'uni',false,'ErrorHandler',...
@(ex,in)meta.class.empty(0*fprintf(1,'The classdef for "%s" contains an error: %s\n'...
, in, ex.message)));
% The result of the last computation used to be assigned into mc_list, but this
% is no longer required as the same information (and more) is returned later
% by calling "mb_list = meta.class.getAllClasses" since these classes are now cached.
clear cls_list isClass ind1 listed_classes m_files p_files name s
end
%% Get class list from classes belonging to packages (takes long!):
if includePackages
% Get a list of all package classes:
mp_list = meta.package.getAllPackages; mp_list = vertcat(mp_list{:});
% see http://www.mathworks.com/help/matlab/ref/meta.package.getallpackages.html
% Recursively flatten package list:
mp_list = flatten_package_list(mp_list);
% Extract classes out of packages:
mp_list = vertcat(mp_list.ClassList);
end
%% Combine lists:
% Get a list of all classes that are in memory:
mb_list = meta.class.getAllClasses;
mc_list = union(vertcat(mb_list{:}), mp_list);
%% Map relations:
try
[subcls_list,discovered_classes] = find_superclass_relations(mc_list);
while ~isempty(discovered_classes)
mc_list = union(mc_list, discovered_classes);
[subcls_list,discovered_classes] = find_superclass_relations(mc_list);
end
catch ex % Turns out this helps....
disp(['Getting classes failed with error: ' ex.message ' Retrying...']);
[mc_list,subcls_list] = q37829489;
end
end
function [subcls_list,discovered_classes] = find_superclass_relations(known_metaclasses)
%% Build hierarchy:
sup_list = {known_metaclasses.SuperclassList}.';
% Count how many superclasses each class has:
n_supers = cellfun(@numel,sup_list);
% Preallocate a Subclasses container:
subcls_list = cell(numel(known_metaclasses),1); % should be meta.MetaData
% Iterate over all classes and
% discovered_classes = meta.class.empty(1,0); % right type, but causes segfault
discovered_classes = meta.class.empty;
for depth = max(n_supers):-1:1
% The function of this top-most loop was initially to build a hierarchy starting
% from the deepest leaves, but due to lack of ideas on "how to take it from here",
% it only serves to save some processing by skipping classes with "no parents".
tmp = known_metaclasses(n_supers == depth);
for ind1 = 1:numel(tmp)
% Fortunately, SuperclassList only shows *DIRECT* superclasses. So we
% only need to find the superclasses in the known classes list and add
% the current class to that list.
curr_cls = tmp(ind1);
% It's a shame bsxfun only works for numeric arrays, or else we would employ:
% bsxfun(@eq,mc_list,tmp(ind1).SuperclassList.');
for ind2 = 1:numel(curr_cls.SuperclassList)
pos = find(curr_cls.SuperclassList(ind2) == known_metaclasses,1);
% Did we find the superclass in the known classes list?
if isempty(pos)
discovered_classes(end+1,1) = curr_cls.SuperclassList(ind2); %#ok<AGROW>
% disp([curr_cls.SuperclassList(ind2).Name ' is not a previously known class.']);
continue
end
subcls_list{pos} = [subcls_list{pos} curr_cls];
end
end
end
end
% The full flattened list for MATLAB R2016a contains about 20k classes.
function flattened_list = flatten_package_list(top_level_list)
flattened_list = top_level_list;
for ind1 = 1:numel(top_level_list)
flattened_list = [flattened_list;flatten_package_list(top_level_list(ind1).PackageList)];
end
end
The outputs of this function are 2 vectors, which in Java terms can be thought of as a Map<meta.class, List<meta.class>>:
mc_list - an object vector of class meta.class, where each entry contains information about one specific class known to MATLAB. These are the "keys" of our Map.
subcls_list - A (rather sparse) vector of cells, containing known direct subclasses of the classes appearing in the corresponding position of mc_list. These are the "values" of our Map, which are essentially List<meta.class>.
Once we have these two lists, it's only a matter of finding the position of your class-of-interest in mc_list and getting the list of its subclasses from subcls_list. If indirect subclasses are required, the same process is repeated for the subclasses too.
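The repeated lookup for indirect subclasses can be sketched as a breadth-first traversal. This is only a minimal sketch and not part of the code above: the function name all_subclass_indices and the choice to return positions in mc_list (rather than meta.class objects) are assumptions made for illustration.

```matlab
function idx = all_subclass_indices(mc_list, subcls_list, root_idx)
% Hypothetical helper (not part of q37829489): returns the positions in
% mc_list of all direct and indirect subclasses of mc_list(root_idx).
visited = false(numel(mc_list), 1);
queue = root_idx;
while ~isempty(queue)
    curr = queue(1);
    queue(1) = [];
    subs = subcls_list{curr};               % direct subclasses (may be empty)
    for k = 1:numel(subs)
        pos = find(mc_list == subs(k), 1);  % same comparison as in the main code
        if ~isempty(pos) && ~visited(pos)
            visited(pos) = true;            % mark, so each class is queued once
            queue(end+1) = pos; %#ok<AGROW>
        end
    end
end
idx = find(visited);
end
```

Assuming mc_list and subcls_list come from the function above, the root position could be obtained with find(mc_list == meta.class.fromName('sde'), 1), and the names of the result read with {mc_list(idx).Name}.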
Alternatively, one can represent the hierarchy using e.g. a logical sparse adjacency matrix, A, where $a_{i,j} = 1$ signifies that class i is a subclass of j. Then the transpose of this matrix can signify the opposite relation, that is, $a^T_{i,j} = 1$ means i is a superclass of j. Keeping these properties of the adjacency matrix in mind allows very rapid searches and traversals of the hierarchy (avoiding the need for "expensive" comparisons of meta.class objects).
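One possible way to build such a matrix from the two output vectors is sketched below. The helper name build_subclass_adjacency is hypothetical, and the sketch assumes that every entry of subcls_list holds only classes that also appear in mc_list (entries that cannot be located are skipped).

```matlab
function A = build_subclass_adjacency(mc_list, subcls_list)
% Hypothetical helper: A(i,j) is true when mc_list(i) is a direct
% subclass of mc_list(j).
n = numel(mc_list);
rows = zeros(0, 1);
cols = zeros(0, 1);
for j = 1:n
    subs = subcls_list{j};                  % direct subclasses of class j
    for k = 1:numel(subs)
        i = find(mc_list == subs(k), 1);    % position of the subclass
        if ~isempty(i)
            rows(end+1, 1) = i; %#ok<AGROW>
            cols(end+1, 1) = j; %#ok<AGROW>
        end
    end
end
% Build a sparse numeric matrix first, then convert it to sparse logical.
A = sparse(rows, cols, ones(size(rows)), n, n) ~= 0;
end
```

With this layout, find(A(:,j)) gives the positions of the direct subclasses of class j, and find(A(i,:)) gives the direct superclasses of class i, so the meta.class comparisons are paid only once, while building A.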
The code sometimes fails with an error (Invalid or deleted object.), in that case re-running it helps. I have added a try/catch that does this automatically.
As a side effect, the code checks classdefs and reports any errors in them - this can be a useful tool to run every once in a while for anyone who works on MATLAB OOP :)
To compare these results with Oleg's example, I will use the output of a run of the above script on my computer (containing ~20k classes; uploaded here as a .mat file). We can then access the class map the following way:
hRoot = meta.class.fromName('sde');
subcls_list{mc_list==hRoot}
ans =
class with properties:
Name: 'sdeddo'
Description: ''
DetailedDescription: ''
Hidden: 0
Sealed: 0
Abstract: 0
Enumeration: 0
ConstructOnLoad: 0
HandleCompatible: 0
InferiorClasses: {0x1 cell}
ContainingPackage: [0x0 meta.package]
PropertyList: [9x1 meta.property]
MethodList: [18x1 meta.method]
EventList: [0x1 meta.event]
EnumerationMemberList: [0x1 meta.EnumeratedValue]
SuperclassList: [1x1 meta.class]
subcls_list{mc_list==subcls_list{mc_list==hRoot}} % simulate recursion
ans =
class with properties:
Name: 'sdeld'
Description: ''
DetailedDescription: ''
Hidden: 0
Sealed: 0
Abstract: 0
Enumeration: 0
ConstructOnLoad: 0
HandleCompatible: 0
InferiorClasses: {0x1 cell}
ContainingPackage: [0x0 meta.package]
PropertyList: [9x1 meta.property]
MethodList: [18x1 meta.method]
EventList: [0x1 meta.event]
EnumerationMemberList: [0x1 meta.EnumeratedValue]
SuperclassList: [1x1 meta.class]
Here we can see that the last output is only 1 class (sdeld), when we were expecting 3 of them (sdeld, sdemrd, heston) - this means that some classes are missing from this list (see footnote 1).
In contrast, if we check a common parent class such as handle, we see a completely different picture:
subcls_list{mc_list==meta.class.fromName('handle')}
ans =
1x4059 heterogeneous class (NETInterfaceCustomMetaClass, MetaClassWithPropertyType, MetaClass, ...) array with properties:
Name
Description
DetailedDescription
Hidden
Sealed
Abstract
Enumeration
ConstructOnLoad
HandleCompatible
InferiorClasses
ContainingPackage
PropertyList
MethodList
EventList
EnumerationMemberList
SuperclassList
To conclude this in several words: this method attempts to index all known classes on the MATLAB path. Building the class list/index takes several minutes, but this is a 1-time process that pays off later when the list is searched. It seems to miss some classes, but the found relations are not restricted to the same packages, paths etc. For this reason it inherently supports multiple inheritance.
1 - I currently have no idea what causes this.
Since the code was > 200 lines, I moved it to the GitHub repository getSubclasses. You are welcome to request features and post bug reports.
Given a root class name or a meta.class and a folder path, it will traverse down the folder structure and build a graph with all the subclasses derived from the root (at infinite depth). If the path is not supplied, then it will recurse down from the folder where the root class is located.
Note that the solution is local, which is why it is fast, and that it relies on the assumption that subclasses are nested under some subfolder of the chosen path.
List all subclasses of the sde class. You will need R2015b to be able to produce the graph, or you can use the output and the FEX submission plot_graph() to produce a dependency graph.
getSubclasses('sde','C:\Program Files\MATLAB\R2016a\toolbox\finance\finsupport')
And the output with the edges and node names:
names       from    to
________    ____    __
'sde'         1      1
'bm'          2      3
'sdeld'       3      6
'cev'         4      3
'gbm'         5      4
'sdeddo'      6      1
'heston'      7      6
'cir'         8      9
'sdemrd'      9      6
'hwv'        10      9
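As a rough sketch of how this output could be turned into a graph object, the snippet below rebuilds the edges with digraph (available from R2015b). The variable names are illustrative, and the edge direction is an assumption read off the table: each row is taken to point from a class to its direct superclass, with the root pointing to itself.

```matlab
% Node names and edge endpoints copied from the table above.
names = {'sde','bm','sdeld','cev','gbm','sdeddo','heston','cir','sdemrd','hwv'};
from  = [1 2 3 4 5 6 7 8 9 10];
to    = [1 3 6 3 4 1 6 9 6 9];

% Build the graph from node names, so nodes can be queried by name.
G = digraph(names(from), names(to));

% Assumption: an edge points from a subclass to its direct superclass,
% so the predecessors of a node are its direct subclasses.
directSubs = predecessors(G, 'sdeddo');
```

Under that assumption, directSubs holds sdeld, heston and sdemrd, and plot(G) draws the hierarchy.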
We can compare the result with the official documentation, which in this case lists the SDE hierarchy, i.e.
On Win7 64b R2016a
getSubclasses('sde','C:\Program Files\MATLAB\R2016a\toolbox\finance\finsupport')
getSubclasses('sde',matlabroot);
Not a complete solution, but you could parse all .m files in the path as text, and use regular expressions to look for the subclass definitions.
Something like ^\s*classdef\s*(\w*)\s*<\s*superClassName\s*(%.*)?
Note that this will fail silently on any subclass definitions that use anything fancy, such as eval.
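A minimal sketch of this idea follows. The function name find_subclasses_by_regex is hypothetical, it scans only the .m files directly inside one folder (not the whole path), and it uses a slightly extended pattern that also tolerates a classdef attribute block and several superclasses joined by &. Superclass lists continued onto the next line with ... are not handled.

```matlab
function subNames = find_subclasses_by_regex(folder, superClassName)
% Hypothetical helper: lists classes defined in the .m files directly
% inside FOLDER whose classdef line names superClassName as a superclass.
files = dir(fullfile(folder, '*.m'));
subNames = cell(0, 1);
% Tokens: (1) class name, (2) text between '<' and an optional comment.
pat = '^\s*classdef\s+(?:\(.*?\)\s*)?(\w+)\s*<([^%]+)';
for k = 1:numel(files)
    txt = fileread(fullfile(folder, files(k).name));
    lines = regexp(txt, '\r?\n', 'split');
    for n = 1:numel(lines)
        tok = regexp(lines{n}, pat, 'tokens', 'once');
        if isempty(tok)
            continue    % not a classdef line with a superclass list
        end
        supers = strtrim(strsplit(tok{2}, '&'));
        if any(strcmp(supers, superClassName))
            subNames{end+1, 1} = tok{1}; %#ok<AGROW>
        end
        break           % only the first classdef line of a file matters
    end
end
end
```

Only direct subclasses are found this way; indirect ones would need the search to be repeated for each returned name.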