Are there any 'tricks' to speed up sampling of a very large knapsack-combination-type problem?

情话喂你 2020-12-29 10:21

UPDATE: I have realized the problem below is not possible to answer in its current form because of the large amount of data involved (15k+ items). I just found out, the

9 answers
  • 2020-12-29 10:30

    This uses dynamic programming to solve the same problem you gave in the example. It's been updated to deal with duplicate values by keeping track of the value's index rather than its value, and to correct a bug which omitted some solutions.

    public class TurboAdder {
        private static final int[] data = new int[] { 5, 10, 20, 25, 40, 50 };
    
        private static class Node {
            public final int index;
            public final int count;
            public final Node prevInList;
            public final int prevSum;
            public Node(int index, int count, Node prevInList, int prevSum) {
                this.index = index;
                this.count = count;
                this.prevInList = prevInList;
                this.prevSum = prevSum;
            }
        }
    
        private static int target = 100;
        private static Node sums[] = new Node[target+1];
    
        // Only for use by printString.
        private static boolean forbiddenValues[] = new boolean[data.length];
    
        public static void printString(String prev, Node n) {
            if (n == null) {
                System.out.println(prev);
            } else {
                while (n != null) {
                    int idx = n.index;
                    // We prevent recursion on a value already seen.
                    if (!forbiddenValues[idx]) {
                        forbiddenValues[idx] = true;
                        printString((prev == null ? "" : (prev+" + "))+data[idx]+"*"+n.count, sums[n.prevSum]);
                        forbiddenValues[idx] = false;
                    }
                    n = n.prevInList;
                }
            }
        }
    
        public static void main(String[] args) {
            for (int i = 0; i < data.length; i++) {
                int value = data[i];
                for (int count = 1, sum = value; count <= 100 && sum <= target; count++, sum += value) {
                    for (int newsum = sum+1; newsum <= target; newsum++) {
                        if (sums[newsum - sum] != null) {
                            sums[newsum] = new Node(i, count, sums[newsum], newsum - sum);
                        }
                    }
                }
                for (int count = 1, sum = value; count <= 100 && sum <= target; count++, sum += value) {
                    sums[sum] = new Node(i, count, sums[sum], 0);
                }
            }
            printString(null, sums[target]);
    
        }
    }
    
  • 2020-12-29 10:33

    Your code does not match your problem statement and it is therefore unclear how to proceed.

    You say that the data list contains negative values and contains duplicates. You give an example which does both. In fact, the values are limited to non-zero integers in the range [-200,200] but the data list is at least 2,000 and typically 10,000 or more, so there would have to be duplicates.

    Let's review your "basic logic":

    for (int c = 100; c >= 0; c--) {
        if (c * x_k == current.sum) { // if result is correct then save
            solutions.add(new Context(0, 0, newcoeff));
            continue;
        } else if (current.k > 0) { // recurse with next data element
            contexts.add(new Context(current.k - 1, current.sum - c * x_k, newcoeff));
        }
    }
    

    Elsewhere you state that the data must be sorted in numerical order and that you start from the tail of the list, k = n - 1 (because of zero indexing), so you start with the biggest values first. The first branch of the if (the "then" clause) terminates the recursion. While this may be fine in the problem you are solving, it is not the problem you are describing, because it ignores all the combinations of lesser data values that sum to zero.

    On the other hand, all the combinations of greater values that sum to zero would be included.

    Let's look, for example, at the last item on your example list, 156, with target sum 5000.

    156 * 100 = 15600 so it will not match the target sum until you get into the negative numbers. Of course

    (100 * -100) + (100 * -6) + (100 * 156) = 5000
    

    and this combination works. (Your sample data set does not include a -100, but it does have two -40s and a -20, so if you want to be true to the data set combine them instead. I'm using -100 to keep the example simple and because you say the data set could include -100.)

    But of course

    (100 * -100) + (100 * -6) + (c * -1) + (c * 1) + (100 * 156) = 5000 
    

    for any c, so you will have 100 combinations like this in the output (1 <= c <= 100). However, the data set also contains 50. When you reach 100 * 50 = 5000 you terminate the recursion, so you will never get

    (c * -1) + (c * 1) + (100 * 50) = 5000 
    

    So either your code or your problem statement is buggy, and probably both. Even without considering the coefficients, 10,000 items taken 60 at a time yields on the order of 10^158 combinations. Aside from this premature termination of recursion, I see nothing that would prevent you from having to test the sum of every one of those combinations, and even if computing the values cost nothing, you could not do that many comparisons.
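    The order of magnitude quoted above can be checked without big-number arithmetic by summing logarithms. This is only a minimal sketch; the class and method names (ComboMagnitude, log10Binomial) are made up for illustration and are not part of the original code.

```java
public class ComboMagnitude {
    // log10 of C(n, k), accumulated as a sum of logs so nothing overflows.
    // Uses C(n, k) = product over i = 1..k of (n - k + i) / i.
    static double log10Binomial(int n, int k) {
        double log = 0.0;
        for (int i = 1; i <= k; i++) {
            log += Math.log10((double) (n - k + i) / i);
        }
        return log;
    }

    public static void main(String[] args) {
        // 10,000 items taken 60 at a time, coefficients ignored.
        System.out.printf("C(10000, 60) is about 10^%.1f%n", log10Binomial(10000, 60));
    }
}
```

    Working in log space keeps the whole calculation in ordinary doubles, which is enough when only the exponent matters.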

  • 2020-12-29 10:39

    There are 137 unique values (ignoring the repeats) in the sample data given.

    If you concede that nearly any combination of 30 distinct values pulled from the data at random can be massaged into at least one valid solution by adjusting the coefficients, then there must be at least C(137,30)=1.54E30 solutions with exactly 30 terms (and another 5.31E30 with 31 terms, 1.76E31 with 32 terms, 5.60E31 with 33 terms, etc).
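    For anyone who wants to reproduce those counts, here is a small sketch using exact integer arithmetic. The names SolutionCountBound and binomial are illustrative assumptions, not something from the question or from the program further down.

```java
import java.math.BigInteger;

public class SolutionCountBound {
    // Exact C(n, k) via the multiplicative formula. After step i the running
    // value equals C(n - k + i, i), so every division is exact.
    static BigInteger binomial(int n, int k) {
        BigInteger result = BigInteger.ONE;
        for (int i = 1; i <= k; i++) {
            result = result.multiply(BigInteger.valueOf(n - k + i))
                           .divide(BigInteger.valueOf(i));
        }
        return result;
    }

    public static void main(String[] args) {
        // 137 unique values, choosing 30 to 33 distinct terms per solution.
        for (int terms = 30; terms <= 33; terms++) {
            System.out.printf("C(137, %d) = %.2e%n", terms, binomial(137, terms).doubleValue());
        }
    }
}
```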

    So, if your goal is just a sampling of valid solutions, and not an impossible exhaustive list, then I submit that an ideal approach is to select your target number of terms at random and then adjust their coefficients to reach the target value to produce a single sample, and repeat for the desired number of samples.

    Below is a program that applies this technique. On my fairly modest laptop (1.3 GHz AMD E-300), I produced 15000 unique solutions in 2.249 seconds targeting 32 terms per sample.

    import java.util.Arrays;
    import java.util.HashMap;
    import java.util.Map;
    import java.util.Map.Entry;
    import java.util.Random;
    import java.util.TreeMap;
    
    public class ComboGen {
    
        private static final int[] data = { -193, -138, -92, -80, -77, -70, -63, -61, -60, -56, -56, -55, -54, -54, -51, -50, -50,
                -50, -49, -49, -48, -46, -45, -44, -43, -43, -42, -42, -42, -42, -41, -41, -40, -40, -39, -38, -38, -38, -37, -37,
                -37, -37, -37, -36, -36, -36, -35, -34, -34, -34, -34, -34, -34, -34, -33, -33, -33, -32, -32, -32, -32, -32, -32,
                -32, -32, -31, -31, -31, -31, -31, -31, -31, -30, -30, -30, -30, -30, -29, -29, -29, -29, -29, -29, -29, -29, -29,
                -28, -28, -28, -28, -27, -27, -27, -27, -26, -26, -26, -26, -26, -26, -25, -25, -25, -25, -25, -25, -25, -25, -24,
                -24, -24, -24, -24, -24, -24, -24, -24, -24, -23, -23, -23, -23, -23, -23, -23, -23, -22, -22, -22, -22, -22, -22,
                -22, -22, -22, -21, -21, -21, -21, -21, -21, -21, -20, -20, -20, -20, -20, -20, -20, -19, -19, -19, -19, -19, -19,
                -19, -19, -19, -19, -19, -19, -19, -19, -18, -18, -18, -18, -18, -18, -18, -18, -18, -18, -18, -18, -18, -18, -18,
                -18, -18, -17, -17, -17, -17, -17, -17, -17, -17, -17, -17, -17, -17, -17, -17, -17, -17, -17, -16, -16, -16, -16,
                -16, -16, -16, -16, -16, -16, -16, -16, -16, -16, -16, -15, -15, -15, -15, -15, -15, -15, -15, -15, -15, -15, -15,
                -15, -15, -15, -15, -15, -15, -15, -15, -15, -15, -15, -15, -15, -15, -15, -15, -15, -15, -15, -14, -14, -14, -14,
                -14, -14, -14, -14, -14, -14, -14, -14, -14, -14, -14, -14, -14, -14, -14, -14, -13, -13, -13, -13, -13, -13, -13,
                -13, -13, -13, -13, -13, -13, -13, -13, -13, -13, -13, -13, -13, -13, -13, -13, -13, -13, -13, -13, -13, -13, -13,
                -13, -13, -13, -13, -12, -12, -12, -12, -12, -12, -12, -12, -12, -12, -12, -12, -12, -12, -12, -12, -12, -12, -12,
                -12, -12, -12, -12, -12, -12, -12, -12, -11, -11, -11, -11, -11, -11, -11, -11, -11, -11, -11, -11, -11, -11, -11,
                -11, -11, -11, -11, -11, -11, -11, -11, -11, -11, -11, -11, -11, -11, -11, -11, -11, -11, -10, -10, -10, -10, -10,
                -10, -10, -10, -10, -10, -10, -10, -10, -10, -10, -10, -10, -10, -10, -10, -10, -10, -10, -10, -10, -10, -10, -10,
                -10, -10, -10, -10, -10, -9, -9, -9, -9, -9, -9, -9, -9, -9, -9, -9, -9, -9, -9, -9, -9, -9, -9, -9, -9, -9, -9, -9,
                -9, -9, -9, -9, -9, -9, -9, -9, -9, -9, -9, -9, -9, -9, -9, -9, -9, -9, -9, -9, -9, -9, -9, -9, -9, -9, -9, -9, -9,
                -9, -9, -9, -9, -9, -9, -9, -8, -8, -8, -8, -8, -8, -8, -8, -8, -8, -8, -8, -8, -8, -8, -8, -8, -8, -8, -8, -8, -8,
                -8, -8, -8, -8, -8, -8, -8, -8, -8, -8, -8, -8, -8, -8, -8, -8, -8, -8, -8, -8, -8, -8, -8, -8, -8, -8, -8, -8, -7,
                -7, -7, -7, -7, -7, -7, -7, -7, -7, -7, -7, -7, -7, -7, -7, -7, -7, -7, -7, -7, -7, -7, -7, -7, -7, -7, -7, -7, -7,
                -7, -7, -7, -7, -7, -7, -7, -7, -7, -7, -7, -7, -7, -7, -7, -7, -7, -7, -7, -7, -7, -7, -7, -7, -7, -7, -7, -7, -7,
                -7, -7, -7, -7, -7, -7, -7, -7, -7, -7, -7, -7, -7, -6, -6, -6, -6, -6, -6, -6, -6, -6, -6, -6, -6, -6, -6, -6, -6,
                -6, -6, -6, -6, -6, -6, -6, -6, -6, -6, -6, -6, -6, -6, -6, -6, -6, -6, -6, -6, -6, -6, -6, -6, -6, -6, -6, -6, -6,
                -6, -6, -6, -6, -6, -6, -6, -6, -6, -6, -6, -6, -6, -6, -6, -6, -6, -6, -6, -6, -6, -6, -6, -6, -6, -6, -6, -6, -6,
                -6, -6, -6, -6, -6, -5, -5, -5, -5, -5, -5, -5, -5, -5, -5, -5, -5, -5, -5, -5, -5, -5, -5, -5, -5, -5, -5, -5, -5,
                -5, -5, -5, -5, -5, -5, -5, -5, -5, -5, -5, -5, -5, -5, -5, -5, -5, -5, -5, -5, -5, -5, -5, -5, -5, -5, -5, -5, -5,
                -5, -5, -5, -5, -5, -5, -5, -5, -5, -5, -5, -5, -5, -5, -5, -5, -5, -5, -5, -5, -5, -5, -5, -5, -5, -5, -5, -5, -5,
                -5, -5, -5, -5, -4, -4, -4, -4, -4, -4, -4, -4, -4, -4, -4, -4, -4, -4, -4, -4, -4, -4, -4, -4, -4, -4, -4, -4, -4,
                -4, -4, -4, -4, -4, -4, -4, -4, -4, -4, -4, -4, -4, -4, -4, -4, -4, -4, -4, -4, -4, -4, -4, -4, -4, -4, -4, -4, -4,
                -4, -4, -4, -4, -4, -4, -4, -4, -4, -4, -4, -4, -4, -4, -4, -4, -4, -4, -4, -4, -4, -4, -4, -4, -4, -4, -4, -4, -4,
                -4, -4, -4, -4, -4, -4, -4, -4, -4, -4, -4, -4, -4, -4, -4, -4, -4, -4, -4, -4, -4, -4, -4, -4, -4, -4, -4, -4, -4,
                -4, -4, -4, -4, -4, -4, -4, -4, -4, -3, -3, -3, -3, -3, -3, -3, -3, -3, -3, -3, -3, -3, -3, -3, -3, -3, -3, -3, -3,
                -3, -3, -3, -3, -3, -3, -3, -3, -3, -3, -3, -3, -3, -3, -3, -3, -3, -3, -3, -3, -3, -3, -3, -3, -3, -3, -3, -3, -3,
                -3, -3, -3, -3, -3, -3, -3, -3, -3, -3, -3, -3, -3, -3, -3, -3, -3, -3, -3, -3, -3, -3, -3, -3, -3, -3, -3, -3, -3,
                -3, -3, -3, -3, -3, -3, -3, -3, -3, -3, -3, -3, -3, -3, -3, -3, -3, -3, -3, -3, -3, -3, -3, -3, -3, -3, -3, -3, -3,
                -3, -3, -3, -3, -3, -3, -3, -3, -3, -3, -3, -3, -3, -3, -3, -3, -3, -3, -3, -3, -3, -3, -3, -3, -3, -2, -2, -2, -2,
                -2, -2, -2, -2, -2, -2, -2, -2, -2, -2, -2, -2, -2, -2, -2, -2, -2, -2, -2, -2, -2, -2, -2, -2, -2, -2, -2, -2, -2,
                -2, -2, -2, -2, -2, -2, -2, -2, -2, -2, -2, -2, -2, -2, -2, -2, -2, -2, -2, -2, -2, -2, -2, -2, -2, -2, -2, -2, -2,
                -2, -2, -2, -2, -2, -2, -2, -2, -2, -2, -2, -2, -2, -2, -2, -2, -2, -2, -2, -2, -2, -2, -2, -2, -2, -2, -2, -2, -2,
                -2, -2, -2, -2, -2, -2, -2, -2, -2, -2, -2, -2, -2, -2, -2, -2, -2, -2, -2, -2, -2, -2, -2, -2, -2, -2, -2, -2, -2,
                -2, -2, -2, -2, -2, -2, -2, -2, -2, -2, -2, -2, -2, -2, -2, -2, -2, -2, -2, -2, -2, -2, -2, -1, -1, -1, -1, -1, -1,
                -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1,
                -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1,
                -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1,
                -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1,
                -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, 1, 1, 1, 1, 1, 1,
                1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
                1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
                1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
                1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2,
                2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2,
                2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2,
                2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2,
                2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 3,
                3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3,
                3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3,
                3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3,
                3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 4, 4, 4, 4, 4, 4, 4, 4, 4,
                4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4,
                4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4,
                4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5,
                5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5,
                5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5,
                5, 5, 5, 5, 5, 5, 5, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6,
                6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6,
                6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7,
                7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7,
                7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8,
                8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8,
                8, 8, 8, 8, 8, 8, 8, 8, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9,
                9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 10, 10, 10, 10, 10, 10, 10,
                10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10,
                10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11,
                11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 12, 12, 12, 12, 12, 12, 12, 12,
                12, 12, 12, 12, 12, 12, 12, 12, 12, 12, 12, 12, 12, 12, 12, 12, 12, 12, 12, 12, 13, 13, 13, 13, 13, 13, 13, 13, 13,
                13, 13, 13, 13, 13, 13, 13, 13, 13, 13, 13, 13, 13, 13, 13, 13, 14, 14, 14, 14, 14, 14, 14, 14, 14, 14, 14, 14, 14,
                14, 14, 14, 14, 14, 14, 14, 14, 14, 14, 14, 14, 14, 14, 14, 14, 14, 14, 14, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15,
                15, 15, 15, 15, 15, 15, 15, 15, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16,
                16, 16, 16, 16, 16, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 18, 18, 18, 18, 18, 18, 18, 18, 18, 18,
                18, 18, 19, 19, 19, 19, 19, 19, 19, 19, 19, 19, 19, 19, 19, 20, 20, 20, 20, 20, 20, 20, 20, 21, 21, 21, 21, 21, 21,
                21, 21, 21, 21, 21, 21, 21, 21, 21, 22, 22, 22, 22, 22, 22, 22, 22, 22, 22, 22, 22, 22, 23, 23, 23, 23, 23, 23, 24,
                24, 24, 24, 24, 24, 24, 24, 24, 24, 24, 24, 24, 24, 24, 24, 25, 25, 25, 25, 25, 25, 25, 25, 25, 25, 25, 25, 26, 26,
                26, 26, 26, 26, 27, 27, 27, 27, 27, 27, 27, 27, 27, 27, 28, 28, 28, 28, 28, 28, 29, 29, 29, 29, 30, 30, 30, 30, 30,
                30, 30, 31, 31, 31, 31, 31, 32, 32, 32, 32, 33, 33, 33, 33, 34, 34, 34, 34, 34, 34, 34, 35, 35, 35, 35, 36, 36, 36,
                36, 37, 37, 38, 39, 39, 39, 40, 41, 41, 41, 41, 41, 42, 42, 43, 43, 44, 45, 45, 46, 47, 47, 48, 48, 49, 49, 50, 54,
                54, 54, 55, 55, 56, 56, 57, 57, 57, 57, 57, 58, 58, 58, 59, 60, 66, 67, 68, 70, 72, 73, 73, 84, 84, 86, 92, 98, 99,
                105, 114, 118, 120, 121, 125, 156 };
    
        private static Map<Integer, Integer> buildCounts(int[] rawData) {
            Map<Integer, Integer> buckets = new TreeMap<Integer, Integer>();
            int i = 0;
            for (i = 0; i < rawData.length; i++) {
                if (buckets.containsKey(rawData[i])) {
                    buckets.put(rawData[i], buckets.get(rawData[i]) + 1);
                } else {
                    buckets.put(rawData[i], 1);
                }
            }
            weights = new int[buckets.size()];
            counts = new int[buckets.size()];
            i = 0;
            for (Entry<Integer, Integer> entry : buckets.entrySet()) {
                weights[i] = entry.getKey();
                counts[i] = entry.getValue();
                i++;
            }
            return buckets;
        }
    
        private static int[] weights;
        private static int[] counts;
        private static Random random = new Random(System.nanoTime());
    
        private static int[] placeChips(int[] chips, int targetPairs) {
            int unplaced = targetPairs;
            int[] placements = new int[unplaced];
            Arrays.fill(chips, 0);
            if (unplaced > chips.length) {
                throw new IllegalStateException("Coefficient pairs must not exceed unique data values.");
            }
            while (unplaced > 0) {
                int idx = random.nextInt(counts.length);
                if (chips[idx] == 0) {
                    chips[idx] = 1 + random.nextInt(100);
                    unplaced--;
                }
            }
            int ppos = 0;
            for (int cpos = 0; cpos < chips.length; cpos++) {
                if (chips[cpos] > 0) {
                    placements[ppos++] = cpos;
                }
            }
            return placements;
        }
    
        static int sum(int[] chips) {
            int sum = 0;
            for (int i = 0; i < chips.length; i++) {
                sum += weights[i] * chips[i];
            }
            return sum;
        }
    
        public static void adjustFactors(int[] chips, int[] placements, int target) {
            int sum = sum(chips);
            Map<Integer, Integer> weightIdx = new HashMap<Integer, Integer>();
            for (int placement : placements) {
                weightIdx.put(weights[placement], placement);
            }
            while (sum != target) {
                // System.out.print(sum + ",");
                int idx = 0;
                if ((sum > target) && weightIdx.containsKey(sum - target) && (chips[weightIdx.get(sum - target)] > 1)) {
                    idx = weightIdx.get(sum - target);
                } else if ((sum < target) && weightIdx.containsKey(sum - target) && (chips[weightIdx.get(sum - target)] > 1)) {
                    idx = weightIdx.get(sum - target);
                } else if ((sum > target) && weightIdx.containsKey(target - sum) && chips[weightIdx.get(target - sum)] < 100) {
                    idx = weightIdx.get(target - sum);
                } else if ((sum < target) && weightIdx.containsKey(target - sum) && chips[weightIdx.get(target - sum)] < 100) {
                    idx = weightIdx.get(target - sum);
                } else {
                    idx = placements[random.nextInt(placements.length)];
                }
                int weight = weights[idx];
                if (sum < target) {
                    if (weight > 0 && chips[idx] < 100) {
                        chips[idx]++;
                        sum += weight;
                    } else if (weight < 0 && chips[idx] > 1) {
                        chips[idx]--;
                        sum -= weight;
                    }
                } else {
                    if (weight > 0 && chips[idx] > 1) {
                        chips[idx]--;
                        sum -= weight;
                    } else if (weight < 0 && chips[idx] < 100) {
                        chips[idx]++;
                        sum += weight;
                    }
                }
            }
        }
    
        private static String oneRandomSet(int targetSum, int targetPairs) {
            int[] chips = new int[counts.length];
            int[] placements = placeChips(chips, targetPairs);
            adjustFactors(chips, placements, targetSum);
            int sum = sum(chips);
            StringBuffer sb = new StringBuffer();
            for (int placement : placements) {
                sb.append(weights[placement]);
                sb.append(" * ");
                sb.append(chips[placement]);
                sb.append(" + ");
            }
            sb.setLength(sb.length() - 2);
            sb.append(" = ");
            sb.append(sum);
            sb.append("\n");
            return sb.toString();
        }
    
        public static void main(String[] ARGV) {
            int targetSum = 5000;
            int targetPairs = 32;
            int targetResults = 15000; // Produce this many solutions
            buildCounts(data);
            StringBuffer sb = new StringBuffer();
            long timer = System.nanoTime();
            for (int i = 0; i < targetResults; i++) {
                sb.append(oneRandomSet(targetSum, targetPairs));
            }
            double seconds = (System.nanoTime() - timer) / 1000000000d;
            double millisPerSol = 1000 * seconds / targetResults;
            System.out.println(sb.toString());
            System.out.println(String.format("%d solutions in %1.3f seconds @ %1.3f millis per sol", targetResults, seconds,
                    millisPerSol));
        }
    
    }
    
  • 2020-12-29 10:41

    You might want to look into dynamic programming. Compared with plain recursion it can save a great deal of time, because it avoids recomputing values by storing them, typically in a 2D array. This tutorial might help you.

    Note: The knapsack problem is considered the introductory problem for dynamic programming, so searching for "knapsack dynamic programming" will turn up more material.
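    As a rough illustration of the 2D-array idea, the sketch below counts how many selections of items reach a target sum. It is a simplified stand-in, not the asker's problem: it assumes non-negative values, uses each item at most once, and ignores coefficients. The names SubsetSumTable and countSubsets are invented for this example.

```java
public class SubsetSumTable {
    // ways[i][s] = number of ways to reach sum s using only the first i items,
    // each item used at most once. Assumes non-negative item values.
    static long countSubsets(int[] items, int target) {
        long[][] ways = new long[items.length + 1][target + 1];
        ways[0][0] = 1; // the empty selection reaches sum 0
        for (int i = 1; i <= items.length; i++) {
            int value = items[i - 1];
            for (int s = 0; s <= target; s++) {
                ways[i][s] = ways[i - 1][s];              // skip item i
                if (s >= value) {
                    ways[i][s] += ways[i - 1][s - value]; // take item i
                }
            }
        }
        return ways[items.length][target];
    }

    public static void main(String[] args) {
        int[] items = { 5, 10, 20, 25, 40, 50 };
        System.out.println(countSubsets(items, 100));
    }
}
```

    Each cell is filled once from the row above, so the work is proportional to (number of items) x (target) rather than to the number of subsets.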

  • 2020-12-29 10:43

    If you do not need an exhaustive and exact solution, you can try to approximate the problem. The program will then run in pseudo-polynomial or even polynomial time.

    See http://en.m.wikipedia.org/wiki/Knapsack_problem#Approximation_algorithms
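    To give a flavour of what such an approximation looks like, here is a sketch of the well-known greedy heuristic for the classic 0/1 knapsack problem (maximize value under a weight limit). Note that this is the textbook knapsack, not the asker's exact-sum problem with coefficients, and the names GreedyKnapsack and approximate are made up for illustration. It assumes positive weights and non-negative values.

```java
import java.util.Arrays;
import java.util.Comparator;

public class GreedyKnapsack {
    // Greedy 0/1 knapsack heuristic: take items in order of value density,
    // then compare against the single most valuable item that fits.
    // Assumes positive weights and non-negative values.
    static int approximate(int[] values, int[] weights, int capacity) {
        Integer[] order = new Integer[values.length];
        for (int i = 0; i < order.length; i++) {
            order[i] = i;
        }
        // Sort indices by value/weight, highest density first.
        Arrays.sort(order, Comparator.comparingDouble(
                (Integer i) -> -(double) values[i] / weights[i]));

        int remaining = capacity;
        int greedyValue = 0;
        int bestSingle = 0;
        for (int idx : order) {
            if (weights[idx] <= capacity) {
                bestSingle = Math.max(bestSingle, values[idx]);
            }
            if (weights[idx] <= remaining) {
                remaining -= weights[idx];
                greedyValue += values[idx];
            }
        }
        return Math.max(greedyValue, bestSingle);
    }

    public static void main(String[] args) {
        int[] values = { 60, 100, 120 };
        int[] weights = { 10, 20, 30 };
        System.out.println(approximate(values, weights, 50));
    }
}
```

    The sort dominates the cost, so this runs in O(n log n). Taking the better of the greedy fill and the best single item is the standard way to guarantee at least half of the optimal value; the result is not exact in general.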

  • 2020-12-29 10:45

    Edit: I'm keeping this answer (for now at least) to preserve the comment thread

    OK, I'm confused, but I'll post my thoughts anyway and edit/delete them later if I'm wrong. If I'm really far off the mark, you can just say so and I'll delete this whole answer.

    First of all, it looks to me that since zero is a valid data value and zero works in all positions, you are getting yourself into extra trouble computing all those combinations. Worse, it looks to me like your algorithm has an actual bug in that it will miss some combinations where a combination of items toward the beginning of the list sum to zero, since you terminate that thread of investigation once you find a combination of items toward the end of the list that yields the target sum.

    Next, it looks to me like for every item in the list, you are trying 100 (actually 101) different values: x*100, x*99, ..., x*0. If I am right, then it follows that the size of the problem space is 100^n where n is the number of data elements. There is no possible way you are examining that for n=100 let alone n=10,000. The only way your program could even be terminating is because you find sums at the end of the list and terminate those threads of investigation. (Oh right, now you tell me you terminate threads when the number of elements with non-zero coefficients exceeds 3, no 50, no 60, no some variable number. The problem space is still too big.)

    In fact, by my count, your test data has 283 zeros. So you can add 100^283 combinations of those data elements times [1,100] to any other answer you get. Given the number of particles in the universe is estimated to be 10^80, 100^283 combinations would be impossible to print on paper.

    Or else I've gotten something wrong. If so, please clue me in.
