Finding the smallest solution set, if one exists (two multipliers)

Question

Note: This is the two-multipliers variation of this problem

Given a set A, consisting of floats between 0.0 and 1.0, find a smallest set B such that for each a in A, either a == B[x] for some x, or a == B[x] * B[y] for some pair of distinct values B[x] and B[y].

For example, given

$ A = [0.125, 0.25, 0.5, 0.75, 0.9]

A possible (but probably not smallest) solution for B is

$ B = solve(A)
$ print(B)
[0.25, 0.5, 0.75, 0.9]

This satisfies the initial problem, because A[0] == B[0] * B[1], A[1] == B[1], etc., which allows us to recreate the original set A. The length of B is smaller than that of A, but I’m guessing there are smaller answers as well.

I assume that the solution space for B is large, if not infinite. If a solution exists, how would a smallest set B be found?

Notes:

We're not necessarily limited to the items in A. B can consist of any set of values, whether or not they exist in A.
Since items in A are all 0-1 floats, I'm assuming that B will also be 0-1 floats. Is this the case?
This may be a constraint satisfaction problem, but I'm not sure how it would be defined.
Since floating point math is generally problematic, any answer should frame the algorithm around rational numbers.
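
As a rough illustration of the rational-number framing, here is a minimal sketch of a checker that tests whether a candidate B reproduces A exactly. The helper name covers is hypothetical, and it assumes "pair of unique values" means two distinct elements of B; values are parsed from decimal strings into fractions.Fraction so that equality tests are exact.

```python
from fractions import Fraction
from itertools import combinations

def covers(A, B):
    """Return True if every a in A is in B or is a product of two distinct elements of B."""
    B = set(B)
    # All products of two distinct elements of B (assumed reading of "pair of unique values")
    products = {x * y for x, y in combinations(B, 2)}
    return all(a in B or a in products for a in A)

# Parse from strings so that 0.9 becomes exactly 9/10 rather than a binary approximation
A = [Fraction(s) for s in ("0.125", "0.25", "0.5", "0.75", "0.9")]
B = [Fraction(s) for s in ("0.25", "0.5", "0.75", "0.9")]

print(covers(A, B))      # True: 0.125 == 0.25 * 0.5, the rest are in B
print(covers(A, B[1:]))  # False: without 0.25, neither 0.125 nor 0.25 can be rebuilt
```

A checker like this does not find B, but it gives any search procedure an exact test to run against.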

Answer 1:

Sort the array. For each pair of elements Am, An ∈ A with m < n, calculate the ratio Am / An.

Check whether the ratio equals some element of A other than Am and An.

Example:

A = { 0.125, 0.25, 0.5, 0.75, 0.9 }

(0.125, 0.25): 0.5    <--- bingo
(0.125, 0.5 ): 0.25   <--- bingo
(0.125, 0.75): 0.1(6)
(0.125, 0.9 ): 0.13(8)
(0.25 , 0.5 ): 0.5
(0.25 , 0.75): 0.(3)
(0.25 , 0.9 ): 0.2(7)
(0.5  , 0.75): 0.(6)
(0.5  , 0.9 ): 0.(5) 
(0.75 , 0.9 ): 0.8(3)

The numerator 0.125 is redundant: it equals 0.25 * 0.5 (found twice, once for each ordering of the factors).
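
The ratio scan above could be sketched as follows. This is an illustration rather than a definitive implementation: the function name redundant_by_ratio is made up here, and Fraction is used instead of floats so that the membership test is exact.

```python
from fractions import Fraction
from itertools import combinations

def redundant_by_ratio(A):
    """Map each element am to the (an, ratio) factorizations found inside A."""
    values = sorted(A)
    present = set(values)
    hits = {}
    # combinations() on a sorted list yields pairs with am < an
    for am, an in combinations(values, 2):
        ratio = am / an
        # The ratio must be a third element of A, different from am and an
        if ratio in present and ratio != am and ratio != an:
            hits.setdefault(am, []).append((an, ratio))
    return hits

A = [Fraction(s) for s in ("0.125", "0.25", "0.5", "0.75", "0.9")]
for am, factors in redundant_by_ratio(A).items():
    for an, ratio in factors:
        print(f"{am} = {an} * {ratio}")
```

For this A the only key in the result is 1/8, with the two factorizations marked "bingo" in the table.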

We can do better by introducing new elements:

Another example:

A = { 0.1, 0.11, 0.12, 0.2, 0.22, 0.24 }

(0.1 , 0.11): 0.(90)        ***
(0.1 , 0.12): 0.8(3)        +++
(0.1 , 0.2 ): 0.5     <--------
(0.1 , 0.22): 0.(45)
(0.1 , 0.24): 0.41(6)
(0.11, 0.12): 0.91(6)       ~~~
(0.11, 0.2 ): 0.55
(0.11, 0.22): 0.5     <--------
(0.11, 0.24): 0.458(3)
(0.12, 0.2 ): 0.6
(0.12, 0.22): 0.(54)
(0.12, 0.24): 0.5     <--------
(0.2 , 0.22): 0.(90)        ***
(0.2 , 0.24): 0.8(3)        +++
(0.22, 0.24): 0.91(6)       ~~~

Any two or more pairs (a1, a2), (a3, a4), ... with a common ratio f = a1 / a2 = a3 / a4 = ... can be replaced with { a2, a4, ..., f }, since a1 = a2 * f, a3 = a4 * f, and so on.

Hence adding 0.5 to our set makes { 0.1, 0.11, 0.12 } redundant.

B = { 0.2, 0.22, 0.24, 0.5 }
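
One way to spot such common ratios is to group the pairs by their exact ratio. The sketch below assumes exact rational inputs; shared_ratios is a hypothetical helper name.

```python
from collections import defaultdict
from fractions import Fraction
from itertools import combinations

def shared_ratios(A):
    """Group pairs (am, an) with am < an by am / an; keep ratios shared by 2+ pairs."""
    groups = defaultdict(list)
    for am, an in combinations(sorted(A), 2):
        groups[am / an].append((am, an))
    return {f: pairs for f, pairs in groups.items() if len(pairs) >= 2}

A = [Fraction(s) for s in ("0.1", "0.11", "0.12", "0.2", "0.22", "0.24")]
# Show the most widely shared ratio first
for f, pairs in sorted(shared_ratios(A).items(), key=lambda kv: -len(kv[1])):
    print(f, [(str(am), str(an)) for am, an in pairs])
```

Here the ratio 1/2 is shared by three pairs, which is why adding 0.5 pays off; the ratios marked ***, +++ and ~~~ are each shared by only two pairs.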

We are now (in the general case) left with an optimization problem of selecting which of these elements to remove and which of these factors to add in order to minimize the cardinality of B (which I leave as an exercise to the reader).
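
For small inputs, that selection step can be brute-forced. The sketch below is only an illustration under a strong assumption: candidates for B are restricted to the elements of A plus all pairwise ratios Am / An, so the result is smallest within that pool and not necessarily a true minimum. The search is exponential in the pool size, and the names covers and smallest_cover are hypothetical.

```python
from fractions import Fraction
from itertools import combinations

def covers(A, B):
    """True if every a in A is in B or is a product of two distinct elements of B."""
    products = {x * y for x, y in combinations(B, 2)}
    return all(a in B or a in products for a in A)

def smallest_cover(A):
    """Try candidate subsets in order of increasing size and return the first that works."""
    values = sorted(set(A))
    # Assumed candidate pool: A itself plus every ratio am / an with am < an
    ratios = {am / an for am, an in combinations(values, 2)}
    pool = sorted(set(values) | ratios)
    for size in range(1, len(values) + 1):
        for subset in combinations(pool, size):
            if covers(values, set(subset)):
                return list(subset)
    return values  # A always covers itself

A = [Fraction(s) for s in ("0.1", "0.11", "0.12", "0.2", "0.22", "0.24")]
B = smallest_cover(A)
print(len(B), [str(b) for b in B])
```

On the second example this finds a cover with four elements, the same cardinality as { 0.2, 0.22, 0.24, 0.5 }.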

Note that there is no need to introduce numbers greater than 1. B can also be represented as { 0.1, 0.11, 0.12, 2} but this set has the same cardinality.

Answer 2:

Google's OR-Tools provides a nice CP solver which can be used to get solutions to this. You can encode your problem as a simple set of boolean variables, saying which values or combinations of values are used.

I start by pulling in the relevant part of the library and setting up a few variables:

from ortools.sat.python import cp_model

A = [0.125, 0.25, 0.5, 0.75, 0.9]
# A = [0.1, 0.11, 0.12, 0.2, 0.22, 0.24]

model = cp_model.CpModel()

We can then define a few helper functions for creating variables from our numbers:

vars = {}
def get_var(val):
    assert val >= 0 and val <= 1
    if val in vars:
        return vars[val]

    var = model.NewBoolVar(str(val))
    vars[val] = var
    return var

pairs = {}
def get_pair(pair):
    if pair in pairs:
        return pairs[pair]

    a, b = pair
    av = get_var(a)
    bv = get_var(b)

    var = model.NewBoolVar(f'[{a} * {b}]')
    # (av and bv) implies var
    model.AddBoolOr([av.Not(), bv.Not(), var])
    # var implies av, and var implies bv
    model.AddImplication(var, av)
    model.AddImplication(var, bv)
    pairs[pair] = var
    return var

That is, get_var(0.5) will create a boolean variable (with Name='0.5'), while get_pair((0.5, 0.8)) will create a variable and add constraints so that it is true exactly when the variables for 0.5 and 0.8 are both true. There's a useful document on encoding boolean logic in OR-Tools.
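
To see why the three constraints in get_pair tie the pair variable to both of its factors, the truth table can be checked in plain Python. This is a sketch that mirrors the constraints as ordinary boolean expressions, without calling OR-Tools; constraints_hold is a made-up name.

```python
from itertools import product

def constraints_hold(av, bv, var):
    """Evaluate the three get_pair constraints on plain booleans."""
    bool_or = (not av) or (not bv) or var   # AddBoolOr([av.Not(), bv.Not(), var])
    implies_a = (not var) or av             # AddImplication(var, av)
    implies_b = (not var) or bv             # AddImplication(var, bv)
    return bool_or and implies_a and implies_b

# The constraints are satisfied exactly when var == (av and bv)
for av, bv, var in product([False, True], repeat=3):
    assert constraints_hold(av, bv, var) == (var == (av and bv))
print("pair variable is equivalent to (a AND b)")
```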

Then we can go through A, figuring out which combinations are valid and adding them as constraints to the solver:

for i, a in enumerate(A):
    opts = {(a,)}
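    for a2 in A[i+1:]:
        assert a < a2  # relies on A being sorted in ascending order
        m = a / a2     # the factor needed so that a == m * a2
        if m == a2:
            # a == a2 * a2, so a2 alone is treated as enough
            opts.add((m,))
        elif m < a2:
            opts.add((m, a2))
        else:
            opts.add((a2, m))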

    alts = []
    for opt in opts:
        if len(opt) == 1:
            alts.append(get_var(*opt))
        else:
            alts.append(get_pair(opt))

    model.AddBoolOr(alts)
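
To make the shape of these options concrete, here is a small standalone sketch that lists them for one element without touching the solver. It follows the same logic as the loop above, but uses Fraction values for exact arithmetic, and options_for is a hypothetical helper name.

```python
from fractions import Fraction

def options_for(A, i):
    """Ways to reproduce A[i]: a 1-tuple is a single value, a 2-tuple is a product."""
    a = A[i]
    opts = {(a,)}
    for a2 in A[i + 1:]:
        m = a / a2  # factor such that a == m * a2
        if m == a2:
            opts.add((m,))                    # a is the square of a2
        else:
            opts.add(tuple(sorted((m, a2))))  # store pairs in ascending order
    return opts

A = [Fraction(s) for s in ("0.125", "0.25", "0.5", "0.75", "0.9")]
for opt in sorted(options_for(A, 0)):
    print(" * ".join(str(x) for x in opt))
```

Each printed line is one alternative for 0.125; the solver only has to make at least one of them true.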

Next we need a way of saying that we prefer variables to be false rather than true. The minimal version of this is:

model.Minimize(sum(vars.values()))

But we get much nicer results if we complicate this a bit and put a preference on values that were in A:

costsum = 0
for val, var in vars.items():
    cost = 1000 if val in A else 1001
    costsum += var * cost
model.Minimize(costsum)
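
The weights 1000 and 1001 keep cardinality as the primary criterion: a set of k values costs between 1000k and 1001k, so as long as k stays below 1000 a smaller set is always cheaper, and ties between sets of equal size go to the one with fewer values from outside A. A quick sketch of that arithmetic (total_cost is a made-up helper, not part of the model):

```python
def total_cost(B, A, base=1000):
    """Objective value: base for each value taken from A, base + 1 for each new value."""
    return sum(base if b in A else base + 1 for b in B)

A = [0.1, 0.11, 0.12, 0.2, 0.22, 0.24]

print(total_cost([0.2, 0.22, 0.24, 0.5], A))  # 4001: three values from A, one new
print(total_cost(A, A))                       # 6000: all six values from A
```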

Finally, we can run our solver and print out a solution:

solver = cp_model.CpSolver()
status = solver.Solve(model)
print(solver.StatusName(status))

if status in {cp_model.FEASIBLE, cp_model.OPTIMAL}:
    B = [val for val, var in vars.items() if solver.Value(var)]
    print(sorted(B))

This gives me back the expected sets [0.125, 0.5, 0.75, 0.9] and [0.2, 0.22, 0.24, 0.5] for the two examples at the top.

You could also encode in the solver the fact that you only consider solutions valid if |B| < |A|, but I'd be tempted to do that check outside the solver.

Source: https://stackoverflow.com/questions/56904806/finding-the-smallest-solution-set-if-one-exists-two-multipliers

Tags

python

algorithm

set

constraints

set-theory