I am looking for an efficient way to determine if a set is a subset of another set in Matlab or Mathematica.
Example: Set A = [1 2 3 4] Set B = [4 3] Set C = [3 4 1] Set
I assume that if no set is a superset of all the supplied sets, you want the empty set returned (i.e. "nothing").
So you want to take the union of all the sets, then find the first set in your list with that many elements. This isn't too hard once the input has been reformatted as a list of lists (that step is skipped here). In Mathematica:
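    topSet[a_List] := Module[{pool, biggest, lenBig, i, lenI},
      (* pool: every distinct element that occurs in any of the lists *)
      pool = DeleteDuplicates[Flatten[a]];
      biggest = {};
      lenBig = 0;
      (* keep the first list of maximal length *)
      For[i = 1, i <= Length[a], i++,
        lenI = Length[a[[i]]];
        If[lenI > lenBig, lenBig = lenI; biggest = a[[i]]];
      ];
      (* the longest list is a superset only if it is as large as the pool *)
      If[lenBig == Length[pool], biggest, {}]
    ]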
For instance:
    topSet[{{1,2,3,4},{4,3},{3,4,1},{4,3,2,1}}]
    {1,2,3,4}

    topSet[{{4, 3, 2, 1}, {1, 2, 3, 4}, {4, 3}, {3, 4, 1}}]
    {4,3,2,1}

    topSet[{{1, 2}, {3, 4}}]
    {}
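The same idea can also be sketched more compactly with built-ins. This is only a sketch, not the code from the answer: the name `topSetCompact` is made up for illustration, `SelectFirst` needs Mathematica 10 or later, and applying `Union` to each candidate is an added assumption so that a repeated element inside a list does not inflate its length.

```mathematica
(* Hypothetical compact variant; topSetCompact is an illustrative name. *)
topSetCompact[a_List] := With[{pool = Union @@ a},
  (* first list whose distinct elements are as many as the pooled elements, else {} *)
  SelectFirst[a, Length[Union[#]] == Length[pool] &, {}]
]

topSetCompact[{{1, 2, 3, 4}, {4, 3}, {3, 4, 1}, {4, 3, 2, 1}}]
topSetCompact[{{1, 2}, {3, 4}}]
```

On these two inputs it should give `{1, 2, 3, 4}` and `{}`, matching the examples for `topSet`.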
As a large test:
I.e., a set of 1000 randomly selected subsets of the range [1,1000] was analyzed in 14.64 seconds (and, unsurprisingly, none of them happened to be a superset of all of them).
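The test code itself did not survive in the post. A hypothetical way to set up a comparable test might look like the sketch below. Everything in it is an assumption: the sampling scheme (each number from 1 to 1000 is included with probability 1/2), the fixed seed, and the helper name `topSetCheck`, which restates the length criterion so that the block runs on its own. Timings will differ by machine and version.

```mathematica
(* Hypothetical test data: 1000 random subsets of Range[1000].
   Each element is kept with probability 1/2; the original scheme is unknown. *)
SeedRandom[1];  (* fixed seed only so that the run is repeatable *)
sets = Table[Pick[Range[1000], RandomInteger[1, 1000], 1], {1000}];

(* Same length criterion as topSet, restated so this block is self-contained. *)
topSetCheck[a_List] := With[{n = Length[Union @@ a]},
  SelectFirst[a, Length[#] == n &, {}]
]

(* AbsoluteTiming returns {elapsed seconds, result}. *)
{seconds, result} = AbsoluteTiming[topSetCheck[sets]];
```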
-- Edit - Escaped a less-than sign that was hiding a few lines of the implementation. Also ...
Run time analysis: Let L be the number of lists, N be the total number of elements in all the lists (including duplicates). The pool assignment takes O(L) for the flattening, and O(N) for the deletion of duplicates. In the for loop, all L assignments to lenI cumulatively require O(N) time and all L conditionals require at most O(L) time. The rest is O(1). Since L
Proof of correctness: A superset, if it exists, (1) contains itself, (2) contains any permutation of itself, (3) contains every element present in any (other) set, and (4) is at least as long as any other set in the collection. Consequences: a superset (if present) is the longest set in the collection, any other set of equal length is a permutation of it, and it contains a copy of every element contained in any set. Therefore, assuming no set lists the same element twice, a superset exists if and only if there is a set as large as the union of the collection of sets.
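The length argument can be cross-checked against a direct containment test. In Mathematica 10 or later, `SubsetQ[A, B]` tests whether B is a subset of A, which also answers the original pairwise question. The sketch below is not part of the answer's code: `topSetBruteForce` and `topSetByLength` are made-up names, and the length version assumes that no list repeats an element.

```mathematica
(* Hypothetical helper: first list that contains every list in the collection, else {}. *)
topSetBruteForce[a_List] :=
  SelectFirst[a, Function[s, AllTrue[a, SubsetQ[s, #] &]], {}]

(* Length criterion from the proof, assuming no list repeats an element. *)
topSetByLength[a_List] := With[{n = Length[Union @@ a]},
  SelectFirst[a, Length[#] == n &, {}]
]

examples = {
  {{1, 2, 3, 4}, {4, 3}, {3, 4, 1}, {4, 3, 2, 1}},
  {{4, 3, 2, 1}, {1, 2, 3, 4}, {4, 3}, {3, 4, 1}},
  {{1, 2}, {3, 4}}
};

(* True when both approaches agree on every example. *)
AllTrue[examples, topSetBruteForce[#] === topSetByLength[#] &]
```

The brute-force version makes on the order of L² `SubsetQ` calls for L lists, so it is only useful as a check; the length criterion is the cheaper test.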