Issue in deleting supersets in Matlab

Submitted by 吃可爱长大的小学妹 on 2019-12-12 04:48:33

Question


I have a cell array of sets, and I want to remove every superset for which a subset is also present. The data and my attempt are as follows:

a{1} = [5]
a{2} = [4 11 14]
a{3} = [1]
a{4} = [5 16]
a{5} = [5]
a{6} = [11 16]
a{7} = [11]
a{8} = [16]
a{9} = [9 14 17]
a{10} = [14]

[ii, jj] = ndgrid(1:numel(a));
s = cellfun(@(x,y) all(ismember(x,y)), a(ii), a(jj));
s = triu(s,1); %// count each pair just once, and remove self-pairs
similarity = a(~any(s,1));
celldisp(similarity)

The result is as follows:

a{1} = [5]
a{2} = [4 11 14]
a{3} = [1]
a{4} = [11 16]
a{5} = [11]
a{6} = [16]
a{7} = [9 14 17]
a{8} = [14]

As the output shows, some supersets remain that should have been removed: a{2}, because a{5} = [11] is a subset of it; a{4}, because a{5} = [11] and a{6} = [16] are both subsets of it; and a{7}, because a{8} = [14] is a subset of it.

The expected output is:

a{1} = [5] 
a{2} = [1]
a{3} = [11]
a{4} = [16]
a{5} = [14]

Can anyone help me fix this code so that I get the correct result? Thanks.


Answer 1:


I think you need to use the lower triangular part instead of the upper:

s = tril(s,-1); % instead of s = triu(s,1);

Edit

Keeping only the lower triangular part works only when every superset occurs before its subsets in the cell array. Here is a general version that should work regardless of the order.

[ii, jj] = ndgrid(1:numel(a));
s = cellfun(@(x,y) all(ismember(x,y)), a(ii), a(jj));
% Set diagonal to zero.
s = s - diag(diag(s));
% Indicator matrix for sets that are exactly equal.
same = s & s';
% For equal sets keep only the first occurrence.
keep = triu(same) | ~same.*s;
% Delete supersets.
similarity = a(~any(keep,1));
celldisp(similarity)
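
To illustrate the order dependence of the triangular shortcuts, here is a small sketch on two hypothetical two-set inputs (not taken from the question). The names supersetFirst, subsetFirst, nLower and nUpper are made up for this example. With the superset listed first, tril(s,-1) removes it while triu(s,1) removes nothing; with the subset listed first, the roles swap.

```matlab
% Hypothetical inputs: the same two sets in both orders.
supersetFirst = {[5 16], 5};
subsetFirst   = {5, [5 16]};
cases = {supersetFirst, subsetFirst};

nLower = zeros(1, numel(cases));   % sets kept when using tril(s,-1)
nUpper = zeros(1, numel(cases));   % sets kept when using triu(s,1)

for k = 1:numel(cases)
    c = cases{k};
    [ii, jj] = ndgrid(1:numel(c));
    % s(i,j) is true when c{i} is a subset of c{j}.
    s = cellfun(@(x,y) all(ismember(x,y)), c(ii), c(jj));
    % Column j is dropped when some row i flags c{i} as a subset of c{j}.
    keptLower = c(~any(tril(s,-1), 1));   % only looks at i > j
    keptUpper = c(~any(triu(s, 1), 1));   % only looks at i < j
    nLower(k) = numel(keptLower);
    nUpper(k) = numel(keptUpper);
    fprintf('case %d: tril keeps %d set(s), triu keeps %d set(s)\n', ...
        k, nLower(k), nUpper(k));
end
```

Neither triangle alone catches both orderings, which is why the general version inspects the full matrix and treats exact duplicates separately.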

By the way, it might be easier to just run a double loop instead of the above matrix operations.
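
A minimal sketch of what such a double loop could look like, using the data from the question. This is one possible reading of the suggestion, not the answerer's own code; the variable names remove, isSubset and sameSet are assumptions made for illustration. As in the matrix version, exact duplicates are reduced to their first occurrence.

```matlab
a = {5, [4 11 14], 1, [5 16], 5, [11 16], 11, 16, [9 14 17], 14};

n = numel(a);
remove = false(1, n);   % remove(j) is true when a{j} should be dropped

for j = 1:n
    for i = 1:n
        if i == j
            continue
        end
        isSubset = all(ismember(a{i}, a{j}));               % a{i} is contained in a{j}
        sameSet  = isSubset && all(ismember(a{j}, a{i}));   % both contain each other
        % Drop a{j} if it strictly contains a{i},
        % or if it duplicates a set that appears earlier.
        if isSubset && (~sameSet || i < j)
            remove(j) = true;
            break
        end
    end
end

similarity = a(~remove);
celldisp(similarity)
```

For the question's data this leaves the five sets [5], [1], [11], [16] and [14], matching the expected output.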



Source: https://stackoverflow.com/questions/29679604/issue-in-deleting-supersets-in-matlab
