Are there any 'tricks' to speed up sampling of a very large knapsack combination type problem?

情话喂你 2020-12-29 10:21

UPDATE: I have realized that the problem below cannot be answered in its current form because of the large amount of data involved (15k+ items). I just found out, the

9 answers
  • 2020-12-29 10:49

    To solve this with dynamic programming, all the costs need to be non-negative integers, and you need an array as long as the total cost you are trying to achieve - each element of the array corresponds to solutions for the cost represented by its offset in the array. Since you want all solutions, each element of the array should be a list of last components of a solution. You can reduce the size of this list by requiring that the last component of a solution cost at least as much as any other component of the solution.

    Given this, once you have filled in the array up to length N, you fill entry N+1 by considering every possible item at each of its 100 multiplicities. For each such item you subtract (multiplicity times cost) from N+1 and see that to get a total cost of N+1 you can use this item plus any solution for cost N+1-thisCost. So you look in the array - back at an entry you have already filled in - to see if there is a solution for N+1-thisCost and, if so, and if the current cost*multiplicity is at least as high as some item in array[N+1-thisCost], you can add an entry for item,multiplicity at offset N+1.

    Once you have the array extended to whatever your target cost is, you can work backwards from array[finalCost], looking at the answers there and subtracting off their cost to find out which entry array[finalCost - costOfAnswerHere] to look at next, until the full solution is recovered.
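    The fill and trace-back steps above can be sketched roughly as follows. This is a minimal sketch, not a definitive implementation: the names `build_table`, `trace_back`, `costs` and `max_mult` are made up for illustration, costs are assumed to be positive integers, and each item is assumed to be usable 1 to `max_mult` times. One deliberate swap: instead of requiring the last component to cost at least as much as the others, the sketch requires it to have the highest item index, which breaks ties cleanly and keeps an item from appearing twice in one solution.

```python
from typing import Iterator, List, Optional, Tuple

Component = Tuple[int, int]  # (item index, multiplicity), hypothetical layout


def build_table(costs: List[int], target: int, max_mult: int = 100) -> List[List[Component]]:
    """table[t] lists every (item, multiplicity) that can be the LAST component
    of some solution with total cost t ("last" = highest item index)."""
    table: List[List[Component]] = [[] for _ in range(target + 1)]
    # lowest[t] = smallest item index stored in table[t]; used to check
    # whether cost t can be reached using only lower-indexed items.
    lowest = [len(costs)] * (target + 1)
    for t in range(1, target + 1):
        for i, cost in enumerate(costs):
            for m in range(1, max_mult + 1):
                rest = t - m * cost
                if rest < 0:
                    break
                # Either this component alone hits t, or the remainder is
                # reachable with items that come strictly before item i.
                if rest == 0 or lowest[rest] < i:
                    table[t].append((i, m))
                    lowest[t] = min(lowest[t], i)
    return table


def trace_back(costs: List[int], table: List[List[Component]], t: int,
               bound: Optional[int] = None) -> Iterator[List[Component]]:
    """Yield every solution for cost t that uses only items with index < bound."""
    if bound is None:
        bound = len(costs)
    for i, m in table[t]:
        if i >= bound:
            continue
        rest = t - m * costs[i]
        if rest == 0:
            yield [(i, m)]
        else:
            for prefix in trace_back(costs, table, rest, i):
                yield prefix + [(i, m)]


if __name__ == "__main__":
    costs = [2, 3, 5]
    table = build_table(costs, 10)
    for solution in trace_back(costs, table, 10):
        print(solution)
```

    Filling the table costs roughly target × items × `max_mult` steps, so whether this helps still depends heavily on how large the target cost is.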

    This solution doesn't have an obvious parallel version, but sometimes the speedups with dynamic programming are good enough that it might still be faster - in this case a lot depends on how large the final cost is.

    This is a bit different from normal dynamic programming because you want every answer - hopefully it will still give you some sort of advantage. Come to think of it, it might be better to simply have a possible/impossible flag in the array saying whether or not there is a solution for that array's offset, and then repeat the check for possible combinations when you trace back.
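    The flag-only idea could look something like the sketch below. It is an illustration under assumptions, not the definitive method: `reachable_table` and `enumerate_counts` are hypothetical names, costs are assumed to be positive integers, and the flag is kept per (number of items considered, cost) rather than per cost alone, so that the trace-back never walks into a dead end.

```python
from typing import Iterator, List


def reachable_table(costs: List[int], target: int, max_mult: int = 100) -> List[List[bool]]:
    """reach[k][t] is True if cost t can be made from the first k items,
    each used between 0 and max_mult times."""
    n = len(costs)
    reach = [[False] * (target + 1) for _ in range(n + 1)]
    reach[0][0] = True  # zero items make exactly cost 0
    for k, cost in enumerate(costs, start=1):
        prev, cur = reach[k - 1], reach[k]
        for t in range(target + 1):
            m = 0
            while m <= max_mult and m * cost <= t:
                if prev[t - m * cost]:
                    cur[t] = True
                    break
                m += 1
    return reach


def enumerate_counts(costs: List[int], reach: List[List[bool]], k: int, t: int,
                     max_mult: int = 100) -> Iterator[List[int]]:
    """Yield multiplicity vectors for items 0..k-1 whose total cost is t,
    re-checking the possible/impossible flags while tracing back."""
    if k == 0:
        if t == 0:
            yield []
        return
    cost = costs[k - 1]
    m = 0
    while m <= max_mult and m * cost <= t:
        rest = t - m * cost
        if reach[k - 1][rest]:  # only descend where a solution is known to exist
            for prefix in enumerate_counts(costs, reach, k - 1, rest, max_mult):
                yield prefix + [m]
        m += 1


if __name__ == "__main__":
    costs = [2, 3, 5]
    reach = reachable_table(costs, 10)
    for counts in enumerate_counts(costs, reach, len(costs), 10):
        print(counts)
```

    The recursion goes one level deep per item, so for thousands of items it would have to be rewritten with an explicit stack, and the table needs (items + 1) × (target + 1) flags.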

  • 2020-12-29 10:53

    The problem you are trying to solve is called number partitioning. It is a special case of the knapsack problem. If the values are all integers and you are trying to reach the value M, then you can find a single solution in O(n*M) time. Enumerating all combinations could take exponential time, because there are potentially exponentially many solutions.
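    As a rough illustration of the O(n*M) bound for finding one solution, here is a minimal sketch. It assumes positive integer values and that each value may be used at most once; `find_subset` and the `last` array are hypothetical names introduced for this example.

```python
from typing import List, Optional


def find_subset(values: List[int], target: int) -> Optional[List[int]]:
    """Return indices of one subset of `values` summing to `target`, or None.
    Runs in O(n * target) time and O(target) extra space."""
    # last[s] = index of the item that first made sum s reachable,
    # -1 for the empty sum, None while s is still unreachable.
    last: List[Optional[int]] = [None] * (target + 1)
    last[0] = -1
    for i, v in enumerate(values):
        # Go downwards so that each item is used at most once.
        for s in range(target, v - 1, -1):
            if last[s] is None and last[s - v] is not None:
                last[s] = i
    if last[target] is None:
        return None
    # Walk the parent pointers back from the target to recover the subset.
    chosen: List[int] = []
    s = target
    while s > 0:
        i = last[s]
        chosen.append(i)
        s -= values[i]
    return chosen[::-1]


if __name__ == "__main__":
    values = [3, 34, 4, 12, 5, 2]
    print(find_subset(values, 9))
```

    Only one witness is stored per sum, which is what keeps this polynomial; listing every combination gives up that guarantee.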

  • 2020-12-29 10:54

    I may be wrong, but could this be viewed as an integer partition problem? (I originally wrote "Triangle number", but then realized I had misremembered.)

    If that's the case, each entry would have a membership to a set of results for a given sum. Precalculating and caching the membership results for a given sum (yes a huge table) could form a very quick solution.

    I could probably do it, but I'd need a large data set. Interesting though...

    Consult the guru Knuth http://cs.utsa.edu/~wagner/knuth/fasc3b.pdf
