Algorithm for finding smallest collection of components

独厮守ぢ 2021-02-15 12:57

I'm looking for an algorithm to solve the following problem. I have a number of subsets (1-n) of a given set (a-h). I want to find the smallest collection of subsets that will

3 Answers
  • 2021-02-15 13:21

    This problem is known as Set Basis, and it is NP-complete (Larry J. Stockmeyer: The set basis problem is NP-complete. Technical Report RC-5431, IBM, 1975). Its formulation as a graph problem is Bipartite Dimension. Since it is very hard to solve in general, it may be worth checking whether your data has any helpful properties (e.g., are the sets small? Is the solution small? Can all sets occur?).
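    To make the Set Basis condition concrete, here is a minimal sketch of a checker: a collection is a basis if every given set equals the union of some of its members. The function name `is_set_basis` and the use of `frozenset` are illustrative choices, not part of the original answer.

```python
from typing import FrozenSet, Iterable


def is_set_basis(basis: Iterable[FrozenSet[str]],
                 targets: Iterable[FrozenSet[str]]) -> bool:
    """Return True if every target equals the union of some basis sets."""
    basis = list(basis)
    for target in targets:
        # Only basis sets fully contained in the target can take part in its union.
        usable = [b for b in basis if b <= target]
        if frozenset().union(*usable) != target:
            return False
    return True


targets = [frozenset("abc"), frozenset("cd"), frozenset("abcd")]
print(is_set_basis([frozenset("abc"), frozenset("cd")], targets))  # True
print(is_set_basis([frozenset("abcd")], targets))                  # False
```

    Verifying a candidate is cheap; the hard part is finding a smallest one.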

    I cannot think of an easy ILP formulation. Instead, you could try to reduce the problem to Clique Cover, which is better studied, using either the reduction from Kou & Wong or the one from Nor et al. I have coauthored a paper discussing algorithms for Clique Cover, and written a Clique Cover solver with both an exact solver and two heuristics.

  • 2021-02-15 13:26

    This problem was shown in one of the videos of Coursera's Discrete Optimization lectures. IIRC, it's called the set cover problem.

    IIRC, it's NP-complete or NP-hard, so look into the typical algorithms (exact algorithms for small datasets, metaheuristics for medium/big datasets) and typical frameworks (OptaPlanner, ...).

  • 2021-02-15 13:26

    For this variant of the Set Cover problem, here is an Integer Programming formulation approach, with row generation.

    Let's denote the components a,b,c,d... by their Column number. a=1, b=2 etc.

    The rows are 'subsets.' Let's say that the EXISTING subsets are S1,...Sm. (These are the ones that HAVE to be covered.)

    Notation for NEW subsets

    This is the step where we introduce NEW subsets. Let's call the 'atomic' subsets a_x. All a subsets have only one component.

       a1 is the subset {1,0,0,0}
       a2 is the subset {0,1,0,0}
       a3 is the subset {0,0,1,0}
       ...
    

    Let bxy be subsets with two components.

    So `b13` is the subset with component 1 and 3 being present.
    b13 = {1, 0, 1, 0}
    b34 = {0, 0, 1, 1} etc.
    
    cxyz are subsets with three components.
    For example, c124 = { 1, 1, 0, 1} etc.
    
    d subsets will have 4 components
    e subsets will have 5 components 
    and so on.
    

    Row Generation Step

    Given an EXISTING Set, we generate only the needed NEW a, b, c ... subsets as we need.

    For example, let's take the subset S1 = {1, 0, 1, 1}
    Meaningful sets needed that can help create S1 are
    a1, a3, a4. (Note that a2 is not needed since component b is not a component in S1.)
    b13, b14, b34.
    c134
    

    PREPROCESSING STEP: For each Sj in the EXISTING sets, generate new subsets using the procedure described above. We create only as many ax, bxy, cxyz, dxyzw... as needed. This step has to be done before the formulation step.

    In the worst case, there are (2^num_components-1) subsets needed per Sj. But they are easy to generate.
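    The row-generation step can be sketched as follows. This is only an illustration: the function name `candidate_names` and the `PREFIXES` string are assumptions, and the naming scheme (size letter followed by component numbers) becomes ambiguous once there are ten or more components.

```python
from itertools import combinations

# Size letter per subset size: a = 1 component, b = 2 components, and so on.
PREFIXES = "abcdefgh"


def candidate_names(indicator):
    """Name every non-empty sub-subset of a 0/1 row, e.g. 'a1', 'b13', 'c134'."""
    # 1-based column numbers of the components present in this row.
    present = [i + 1 for i, bit in enumerate(indicator) if bit]
    names = []
    for size in range(1, len(present) + 1):
        for combo in combinations(present, size):
            names.append(PREFIXES[size - 1] + "".join(map(str, combo)))
    return names


print(candidate_names([1, 0, 1, 1]))
# ['a1', 'a3', 'a4', 'b13', 'b14', 'b34', 'c134']
```

    A row with k components yields 2^k - 1 names, which matches the worst-case count mentioned above.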

    Example Problem

    Now the formulation for the following problem:

        a b c d
      1 1 1 1 0
      2 1 1 1 0
      3 1 1 1 0
      4 0 0 1 1
      5 1 1 1 1
    

    We will have one constraint for each ROW. Each set has to be "covered"

    For the problem above, here's the formulation

    Formulation

    Objective Minimize sum of all Subsets.
     Min sum (a_x) + sum (b_xy) + sum (c_xyz) + sum (d_xyzw)
    
    Subject to:
    
       a1 + a2 + a3 + b12 + b13 + b23 + c123  >= 1 \\ Set 1 has to be formed
       a1 + a2 + a3 + b12 + b13 + b23 + c123  >= 1 \\ Set 2 has to be formed
       a1 + a2 + a3 + b12 + b13 + b23 + c123  >= 1 \\ Set 3 has to be formed
       a3 + a4            + b34               >= 1 \\ Set 4 has to be formed
       a1 + a2 + a3 + a4 + b12 + b13 + ..+  b34 + c123 + ...+ d1234  >= 1 \\ Set 5 has to be formed
    
     a's, b's, c's, d's Binary
    

    Upper bound: By exploiting the fact that you need at most j subsets (Number of existing Subsets) you can even add a cut. Objective function has to be j or lower.
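    For tiny instances, the upper bound can also serve as the stopping point of an exhaustive search. The sketch below is not the ILP above: it is a brute-force stand-in that assumes the Set Basis reading of the question (every existing set must equal the union of some chosen subsets). The function name `smallest_basis` is hypothetical, and the running time is exponential.

```python
from itertools import combinations


def smallest_basis(targets):
    """Exhaustively search for a smallest collection whose unions form every target."""
    targets = sorted({frozenset(t) for t in targets}, key=sorted)

    # Candidates: every non-empty sub-subset of some target (row generation).
    candidates = set()
    for t in targets:
        items = sorted(t)
        for size in range(1, len(items) + 1):
            candidates.update(frozenset(c) for c in combinations(items, size))
    candidates = sorted(candidates, key=lambda s: (len(s), sorted(s)))

    def forms_all(chosen):
        # Each target must equal the union of the chosen sets it contains.
        for t in targets:
            usable = [c for c in chosen if c <= t]
            if frozenset().union(*usable) != t:
                return False
        return True

    # The distinct targets themselves always work, so that count bounds the search.
    for size in range(len(targets) + 1):
        for chosen in combinations(candidates, size):
            if forms_all(chosen):
                return list(chosen)
    return list(targets)  # fallback; the loop above already covers this case


# Rows 1-5 of the example problem, written as component letters.
result = smallest_basis(["abc", "abc", "abc", "cd", "abcd"])
print(len(result))  # 2
```

    On the example problem this finds the two subsets {a, b, c} and {c, d}; anything beyond a handful of components needs the ILP or a heuristic instead.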

    Hope that helps.
