问题
I've asked this on math overflow and used comments to clarify/overstate my question. I hope it has the intended effect and doesn't come off as jarring.
I'm attempting to find what subset of numbers reach a known average.
I have a list of known values, negative and possible decimals. They look something like this {-.32,-.64,-.12,.08,-.54,-.43, ...}
It's around 50 numbers in some cases, though this problem would be tested for other cases as well.
The set mostly contains negative decimal numbers, while in rare cases, has a few positive decimals - it never has whole numbers.
I also have a known value, which I know to be the average of some subset of the above list.
The known value is similar to -.03.
I'm uncertain of the grouping mechanism used, but seemed to reach stack overflow trying to solve this problem when not grouping.
I've attempted a few ways of going about solving this problem. I'm using Python 3.6 and imported numpy as np.
I'm wondering if the "subset-avg" code I've adapted from another solution for subset-sum here (I'll give due credit when i can find that question again) is not the most efficient way/if there's any huge mistake in my even attempting to resolve this that I haven't seen.
Thanks in advance for any thoughts.
def subset_avg(numbers, target, partial=[],depth=1):
# create AVG function
# set average of partial
a = np.mean(partial)
# check if the partial sum is equals to target
if a != target:
print("Currently Testing the Following Subset's " " " + "Average(%s) = %s\n\n" % (partial, round(a,2)))
print(depth)
if a == target or round(a,2) == target:
print('\n\n')
print("************")
print("************")
print('\n\n')
print("Found Subset AVG " + "Average(%s) = %s" % (partial, target))
print('\n\n')
print("************")
print("************")
print('\n\n')
print(depth)
# for each number in range of list
for i in range(len(numbers)):
# set n = current iteration in list
n = numbers[i]
# remaining values is current iteration + 1 through end of list
remaining = numbers[i+1:]
# calculate mean of partial, set partial = partial plus n
subset_avg(remaining, target, partial + [n],depth+1)
# Example of use
x = [-.32,-.64,-.12,.08,-.54,-.43]
subset_avg(x,-.03)
回答1:
Here is a solution I adapted from a subSet sum algorithm I posted for another question (here). Since the algorithm loops through potential solution sizes, it was easy to adapt it to search for an average.
The iSubSum()
function takes 3 parameters: The target average, the list of values and an optional rounding precision parameter. It is a generator so it will produce all possible solutions when used in a loop. You can also get the first solution quickly using the next()
function. This should produce results much faster than a brute force approach, especially for large lists.
The function is based on a modified version of a subset-sum algorithm that returns solutions as lists of indices. This is intended to distinguish combinations that have duplicate values coming from different indices in the original list.
from bisect import bisect_right
from itertools import accumulate
def iSubAverage(M,A,P=0):
smallSize = 20
smallSums = set()
def subSumForSize(S,A,size,failedSums=None):
nextSum = A[size-2][2] if size>1 else 0
index = bisect_right([a for a,_,_ in A],S-nextSum) # max element for target
A = A[:index]
if len(A) < size: return # not enough elements for size
if A[size-1][2] > S: return # minimum sum > target
maxSum = A[-1][2]
if len(A) > size: maxSum -= A[-size-1][2]
if maxSum < S: return # maximum sum < target
if len(A) <= smallSize and S not in smallSums: return
if failedSums is None: failedSums = set()
while index >= size:
index -= 1
a,i,ca = A[index]
if size == 1:
if a == S: yield [i]
continue
c0 = A[index-size][2] if index>size else 0
if ca-c0 < S: break
subS = S-a
if subS in failedSums: continue # known unreachable sum
failed = True
for result in subSumForSize(subS,A[:index],size-1,failedSums):
yield result+[i]
failed = False
if failed: failedSums.add(subS)
if not A: return
if M < 0: M,A = -M,[-a for a in A] # must have positive target
offset = max(0,-min(A)) # circumvent negatives (requires loop on sizes)
A = sorted([(round(a+offset,P),i) for i,a in enumerate(A)])
cumA = accumulate(a for a,i in A)
A = [(a,i,ca) for (a,i),ca in zip(A,cumA)]
for a,_,_ in A[:smallSize]:
newSums = [a+s for s in smallSums] + [a]
smallSums.update(newSums)
for size in range(1,len(A)+1):
subS = round(M*size,P)
if subS != round(M*size,P*2): continue # fractional numerator
subS += round(offset*size,P)
for result in subSumForSize(subS,A,size):
yield result
To get the actual values, the iSubAvg()
function maps indices to the corresponding values in the list:
def iSubAvg(M,A,P=0):
for iA in iSubAverage(M,A,P):
yield sorted([A[i] for i in iA])
L = [-.32,-.64,-.12,.08,-.54,-.43]
targetL = -0.02
for solution in iSubAvg(targetL,L,2):
print(solution)
# [-0.12, 0.08] (there isn't a solution for -0.03)
K = [0.72, 0.69, 0.81, -0.28, 0.6, 0.59, 0.77, 0.46, 0.36, 0.66, 0.88, 0.88, 0.9, -0.24, 0.5, -0.5, 0.46, 0.96, -0.22, -0.8, -0.13, 0.87, 0.78, 0.2]
targetK = -0.02
for solution in iSubAvg(targetK,K,2):
print(solution)
# [-0.5, 0.46]
# [-0.5, 0.46]
# [-0.8, -0.22, 0.96]
# [-0.5, -0.28, 0.72]
# [-0.28, -0.24, 0.46]
# [-0.28, -0.24, 0.46]
# [-0.5, -0.24, 0.2, 0.46]
# [-0.5, -0.24, 0.2, 0.46]
# [-0.8, -0.28, -0.24, -0.22, 0.46, 0.96]
# [-0.8, -0.28, -0.24, -0.22, 0.46, 0.96]
next(iSubAvg(0.165,K,2))
# [-0.8, -0.28, -0.24, 0.66, 0.69, 0.96]
note that the function returns all combinations including repetitions for duplicate values in the source list. You can filter out these duplicates if you don't need them
来源:https://stackoverflow.com/questions/55857874/subset-avg-finding-a-subset-of-list-which-matches-known-rational-number