I have a set of floating-point values that I want to divide into two sets whose sizes differ by at most one element. Additionally, the difference between the two sets' value sums should be minimal. Optionally, if the number of elements is odd and the sums cannot be equal, the smaller set should have the larger sum.
That would be the optimal solution, but I only strictly need the subset-size constraint to hold exactly. The difference of sums doesn't strictly need to be minimal, but it should come close. I would also prefer that the smaller set (if any) have the larger sum.
I realize this may be related to the partition problem, but it's not quite the same, or as strict.
My current algorithm is the following, though I wonder if there's a way to improve upon that:
arbitrarily divide the set into two sets of the same size (or with a one-element size difference)
do
    diffOfSums := sum1 - sum2
    foundBetter := false
    betterDiff := 0.0
    foreach pair (value1, value2) with value1 from set1 and value2 from set2 do
        if |diffOfSums - 2 * betterDiff| > |diffOfSums - 2 * (value1 - value2)| then
            foundBetter := true
            betterDiff := value1 - value2
        endif
    done
    if foundBetter then swap the found elements
while foundBetter
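Translated into runnable form, the loop above might look like the following Python sketch (the function name `balanced_split` is my own; swapping one element from each side preserves the size constraint, and each accepted swap strictly reduces the imbalance, so the loop terminates):

```python
def balanced_split(values):
    """Local-search sketch of the swap heuristic: split into two halves,
    then repeatedly swap the cross-pair that best reduces |sum1 - sum2|."""
    half = len(values) // 2
    set1, set2 = list(values[:half]), list(values[half:])
    while True:
        diff = sum(set1) - sum(set2)          # current imbalance
        best = abs(diff)
        best_pair = None
        for i, v1 in enumerate(set1):
            for j, v2 in enumerate(set2):
                # swapping v1 and v2 changes the imbalance by -2 * (v1 - v2)
                cand = abs(diff - 2 * (v1 - v2))
                if cand < best:
                    best, best_pair = cand, (i, j)
        if best_pair is None:                 # no swap improves the split
            return set1, set2
        i, j = best_pair
        set1[i], set2[j] = set2[j], set1[i]
```

Each pass over the pairs costs O(n²), so the overall cost is O(n²) times the number of accepted swaps.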
My problem with this approach is that I'm not sure of the actual complexity and whether it can be improved upon. It certainly doesn't fulfill the requirement to leave the smaller subset with a larger sum.
Is there any existing algorithm that happens to do what I want to achieve? And if not, can you suggest ways for me to either improve my algorithm or figure out that it may already be reasonably good for the problem?
My suggestion would be to sort the values, then consider them in consecutive pairs (v1, v2), (v3, v4), ..., putting one element of each pair into each partition.
The idea is to alternate putting the values into each set, so:
s1 = {v1, v4, v5, v8, . . . }
s2 = {v2, v3, v6, v7, . . . }
If there are an odd number of elements, put the last value into the set that best meets your conditions.
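The scheme above can be sketched in Python as follows (the function name `pairwise_split` is my own, and sending the leftover value to the set with the currently smaller sum is just one simple interpretation of "the set that best meets your conditions"):

```python
def pairwise_split(values):
    """Sort-and-pair heuristic: walk the sorted values in consecutive
    pairs and alternate which set receives the smaller element."""
    vs = sorted(values)
    s1, s2 = [], []
    for k in range(len(vs) // 2):
        a, b = vs[2 * k], vs[2 * k + 1]
        if k % 2 == 0:            # pairs (v1, v2), (v5, v6), ...
            s1.append(a); s2.append(b)
        else:                     # pairs (v3, v4), (v7, v8), ...
            s1.append(b); s2.append(a)
    if len(vs) % 2:               # odd count: leftover goes to the
        if sum(s1) <= sum(s2):    # set with the smaller sum
            s1.append(vs[-1])
        else:
            s2.append(vs[-1])
    return s1, s2
```

For evenly spread values the alternation cancels most of the per-pair imbalance; for example, the values 1..8 split into {1, 4, 5, 8} and {2, 3, 6, 7} with equal sums.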
You have a relaxed definition of minimal, so a full search is unnecessary. The above should work quite well for many distributions of the values.
It is easy to prove that the partition problem reduces to this problem in polynomial time.
Imagine you want to solve partition for some array A, but you only know how to solve your problem. Just double the array's length by padding it with zeros: the zeros let the two subsets be brought to equal size without changing either sum, so an equal-sum, equal-size split of the padded array exists exactly when A can be partitioned. If you can solve the padded instance with your algorithm, you have solved the partition problem, which proves your problem to be NP-hard.
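The padding construction can be illustrated with a small brute-force checker (exponential, for illustration only; `reduce_partition` and `balanced_equal_split_exists` are hypothetical helper names of my own):

```python
from itertools import combinations

def reduce_partition(A):
    """Pad A with len(A) zeros so the equal-size constraint can never
    prevent an equal-sum split: the reduction described above."""
    return A + [0.0] * len(A)

def balanced_equal_split_exists(values):
    """Brute force: is there a half-size subset with half the total sum?"""
    n, total = len(values), sum(values)
    return any(
        2 * sum(values[i] for i in idx) == total
        for idx in combinations(range(n), n // 2)
    )
```

A partition of A exists if and only if the padded instance has a balanced equal-sum split: e.g. [1, 2, 3] partitions as {1, 2} / {3}, while [1, 1, 3] cannot be partitioned at all.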
But you'll find you can't reduce this problem to partition (i.e. it isn't NP-complete) unless you limit the precision of your floats; with limited precision, the same algorithm would solve both.
In the general case, the best you can do is backtrack.
Source: https://stackoverflow.com/questions/32157872/divide-set-of-values-into-two-sets-of-same-or-similar-size-with-similar-value-su