问题
Practically, I've got a set of objects with probabilities, and I want to look at each possible group of them, in order of how likely it is that they're all true assuming they're independent -- i.e. in descending order of the product of the elements of the subsets -- or in order of length if the probabilities are the same (so that (1, 0.5) comes after (0.5)).
Example: If I have [ 1, 0.5, 0.1 ]
I want [ (), (1), (0.5), (1, 0.5), (0.1), (1, 0.1), (0.5, 0.1), (1, 0.5, 0.1) ]
In essence, this means I want to iterate over the powerset of a set of elements in order, and I could fairly easily generate this, sort it, and be done. However, powersets get pretty big pretty fast, I expect I'm usually going to want one of the first subsets, and I'd rather not generate a list of thousands of subsets, sort them, and then never look past the third. This is where python generators hopefully save the day!
More formal specification of the problem, I need to work out a way to do sorted(powerset(input), key = lambda l : reduce (lambda (p, n), e: (p * e, n-1), l, (1, 0)), reverse=True)
, as a generator, or in some other way that lets me avoid building and sorting the entire list.
I'm reasonably sure this is related to the knapsack problem, along with the subset product problem, but I'm really struggling to get a nice algorithm for it that works, and help would be very much appreciated :-). It's not an issue for it to be slower than building + sorting the whole thing in the worst case (iterating all the way to the end), it just needs much better best case (within the first 10%, say) performance.
回答1:
Nice question, it was quite tricky to solve. I can't think of a way to generate the combinations in order either, but I wield the mighty heapq
(aka a priority queue) to keep the candidates sorted.
from heapq import heappush, heappop
import operator
def prob(ps):
""" returns the probability that *not* all ps are True """
return 1-reduce(operator.mul, ps)
def gen(ps):
# turn each to a tuple
items = ((x,) for x in sorted(ps, reverse=True))
# create a priority queue, sorted by probability
pq = [(prob(x),x) for x in items]
# because you wanted this
yield ()
# as long as there are valid combinations
while pq:
# get the best un-yielded combination, the pq makes sure of that
p, x = heappop(pq)
yield x
# generate all the combinations from this item
for other in ps:
# keeping the tuples sorted -> unique combinations
if other < x[-1]:
# create a new combination
new = x+(other,)
item = prob(new), new
# add it to the queue
heappush(pq,item)
a = [1, 0.1, 0.5]
print list(gen(a))
来源:https://stackoverflow.com/questions/5111395/algorithm-to-get-every-possible-subset-of-a-list-in-order-of-their-product-wit