Efficient code to determine if a set is a subset of another set

悲哀的现实 2021-02-20 04:27

I am looking for an efficient way to determine if a set is a subset of another set in Matlab or Mathematica.

Example: Set A = [1 2 3 4] Set B = [4 3] Set C = [3 4 1] Set

4 Answers
  •  情深已故
    2021-02-20 04:50

    You will likely want to take a look at the built-in set operation functions in MATLAB. Why reinvent the wheel if you don't have to? ;)

    HINT: The ISMEMBER function may be of particular interest to you.
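
    As a minimal sketch of the test that hint points to, assuming each set is stored as a plain numeric vector: A, B and C are the sets from the question, while the result variable names (bInA, cInA, aInB, bInA2) are only illustrative.

```matlab
A = [1 2 3 4];
B = [4 3];
C = [3 4 1];

% ismember(B, A) returns a logical array marking which elements of B occur in A.
% B is a subset of A exactly when every entry of that array is true.
bInA = all(ismember(B, A));   % true:  4 and 3 both occur in A
cInA = all(ismember(C, A));   % true:  3, 4 and 1 all occur in A
aInB = all(ismember(A, B));   % false: 1 and 2 do not occur in B

% An equivalent check: the set difference B \ A is empty when B is a subset of A.
bInA2 = isempty(setdiff(B, A));
```

    Element order does not matter for either check, so [4 3] and [3 4] give the same result.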

    EDIT:

    Here's one way you can approach this problem using nested loops that are set up to reduce the number of potential iterations. First, we can use the suggestion in Marc's comment and sort the list of sets by their number of elements, so that they are arranged from largest to smallest:

    setList = {[1 2 3 4],...  %# All your sets, stored in one cell array
               [4 3],...
               [3 4 1],...
               [4 3 2 1]};
    nSets = numel(setList);                       %# Get the number of sets
    setSizes = cellfun(@numel,setList);           %# Get the size of each set
    [temp,sortIndex] = sort(setSizes,'descend');  %# Get the sort index
    setList = setList(sortIndex);                 %# Sort the sets
    

    Now we can set up our loops to start with the smallest sets at the end of the list and compare them first to the largest sets at the start of the list to increase the odds we will find a superset quickly (i.e. we're banking on larger sets being more likely to contain smaller sets). When a superset is found, we remove the subset from the list and break the inner loop:

    for outerLoop = nSets:-1:2
      for innerLoop = 1:(outerLoop-1)
        if all(ismember(setList{outerLoop},setList{innerLoop}))
          setList(outerLoop) = [];
          break;
        end
      end
    end
    

    After running the above code, setList will no longer contain any set that is a subset or a duplicate of another set preceding it in the sorted list.

    In the best case scenario (e.g. the sample data in your question) the inner loop breaks after the first iteration every time, performing only nSets-1 set comparisons using ISMEMBER. In the worst case scenario the inner loop never breaks and it will perform (nSets-1)*nSets/2 set comparisons.
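
    The two comparison counts can be checked with a small sketch that reruns the same sort and loops with a counter added. The cases, counts and remaining variables are hypothetical names introduced for this illustration, and the second input (four mutually disjoint sets) is an assumed worst-case example, not data from the question.

```matlab
% Two inputs: the question's sample data, and four mutually disjoint sets
cases = {{[1 2 3 4], [4 3], [3 4 1], [4 3 2 1]}, ...
         {[1 2], [3 4], [5 6], [7 8]}};
counts    = zeros(1, numel(cases));   % ISMEMBER-based comparisons per case
remaining = zeros(1, numel(cases));   % sets left after pruning per case

for k = 1:numel(cases)
  setList = cases{k};
  nSets = numel(setList);
  [~, sortIndex] = sort(cellfun(@numel, setList), 'descend');
  setList = setList(sortIndex);       % largest sets first
  for outerLoop = nSets:-1:2
    for innerLoop = 1:(outerLoop-1)
      counts(k) = counts(k) + 1;      % one set comparison
      if all(ismember(setList{outerLoop}, setList{innerLoop}))
        setList(outerLoop) = [];      % drop the subset or duplicate
        break;
      end
    end
  end
  remaining(k) = numel(setList);
end

disp(counts)      % 3 6
disp(remaining)   % 1 4
```

    With nSets = 4, the sample data needs nSets-1 = 3 comparisons and leaves a single set, while the disjoint sets need (nSets-1)*nSets/2 = 6 comparisons and nothing is removed.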
