Picking five numbers that sum to S

前端 未结 4 1569
孤街浪徒
孤街浪徒 2021-01-31 10:51

Given an array A of N nonnegative numbers, I\'m interested in finding the number of ways you can pick 5 numbers (from distinct positions in the array)

相关标签:
4条回答
  • 2021-01-31 11:05

    Maybe it's better to create an array with only distinct values first and count the occurence of them in the originally array. Because only the number of solutions is wanted and not the solutions itself, that could be faster if using combination calculations.

    1) Sort array A O(N log N)

    2) Create a new array B where all values are distinct. Also save the count of the occurence of the value in the original array A for every element in B. O(N)

    3) Create a new array C with sums of two elements of B. Including sums of the same element if the count > 1. Also save the both indexes of the elements from B. O(|B|2)

    4) Sort array C by the sums O(|B|2 (log |B|2))

    5) For every element in B find two valid elements from C so that the three values sum up to S and the indexes are in the same order. In pseudo code:

    num=0
    for (i=0; i<n; i++)
      j=i
      k=|C|-1
      while (j <= k)
        if (c[j].sum + c[k].sum = S - b[i].value)
          for (m=0; m<c[j].index.length; m++)
            for (n=0; n<c[k].index.length; n++)
              if (i < c[j].index[m].left < c[j].index[m].right < c[j].index[k].left < c[j].index[k].right)
                num+=b[i].count * b[c[j].index[m].left].count * b[c[j].index[m].right].count * b[c[j].index[k].left].count * b[c[j].index[k].right].count
              else if (b[i].count > 1 && i = c[j].index[m].left < c[j].index[m].right < c[j].index[k].left < c[j].index[k].right)
                num+= binomialcoefficient(b[i].count, 2) * b[c[j].index[m].right].count * b[c[j].index[k].left].count * b[c[j].index[k].right].count
              else if (b[c[j].index[m].left].count > 1 && i < c[j].index[m].left = c[j].index[m].right < c[j].index[k].left < c[j].index[k].right)
                num+= b[i].count * binomialcoefficient(b[c[j].index[m].left].count, 2) * b[c[j].index[k].left].count * b[c[j].index[k].right].count
              [..]
              else if (b[i].count > 2 && i = c[j].index[m].left = c[j].index[m].right < c[j].index[k].left < c[j].index[k].right)
                num+= binomialcoefficient(b[i].count, 3) * b[c[j].index[k].left].count * b[c[j].index[k].right].count
              [..]
              else if (b[i].count > 1 && b[c[j].index[m].right].count > 1 && i = c[j].index[m].left < c[j].index[m].right = c[j].index[k].left < c[j].index[k].right)
                num+= binomialcoefficient(b[i].count, 2) * binomialcoefficient(b[c[j].index[m].right].count, 2) * b[c[j].index[k].right].count
              [..]
              else if (b[i].count > 4 && i = c[j].index[m].left = c[j].index[m].right = c[j].index[k].left = c[j].index[k].right)
                num+= binomialcoefficient(b[i].count, 5)
        if (c[j].sum + c[k].sum >= S - b[i].value)
          k--
        if (c[j].sum + c[k].sum <= S - b[i].value)
          j++
    

    I'm not really sure what time complexity this has. The outer for loop is bound by O(|B|), the while loop by O(|B|2), the inner for loops by O(|B|), because B has only distinct values. So obvisouly it's in O(|B|5). But its O(N) if all elements in A have the same value and if all values are distinct and sufficiently random its maybe possible to bound the number of indexes per sum in C by a constant, which would lead to O(N3).

    The worst case could be somewhere with half the values equal and the other half random or with all numbers distinct but a lot of duplicate sums. But that would also lead to a much shorter while loop. I have a feeling that the while and the two inner for loops are bound by O(N2), so O(N3) in total for all cases, but i can't proof it.

    Also an interesting question here is what is the maximum number of possibilities to pick up 5 numbers which sum to S for an array of N disctinct numbers. If it is in O(N5) the worst case of that algorithm is also O(N5).

    Maybe try it out ;).

    0 讨论(0)
  • 2021-01-31 11:10

    You can do it in O(N*S) with dynamic programming:

    static int count(int[] A, int S) {
        final int K = 5;
        // In count[n][s] we'll count the number of ways you can pick n numbers such that their sum is s
        int[][] count = new int[K+1][S+1];
    
        count[0][0] = 1;  // The base case
        for (int i = 0; i < A.length; i++)
            for (int n = K; n >= 1; n--)
                for (int s = A[i]; s <= S; s++)
                    count[n][s] += count[n-1][s - A[i]];
    
        return count[K][S];
    }
    
    0 讨论(0)
  • 2021-01-31 11:17

    O(N^3) seem possible (though I haven't tried proving it).

    Take all possible pairs and create a new array (say B) of size O(N^2) which holds the sum of all possible pairs. Also keep track of the index of two elements from the original array which gave that sum. - O(N^2)

    Now sort the array - O(N^2LogN).

    Now for each element a in the original array, try to find two elements from B which sum to S-a. Since B is sorted this can be done in O(B) time: Start with two pointers, one at the max and one at the min.

    If Sum of those two > S-a, decrement the pointer near max.

    If Sum of those two < S-a, increment the pointer near min.

    If the sum is equal, then you have found one candidate pair and a new sorted sub-array in which to look for the next possible candidate pair. (You should ensure that the two elements of B come from four elements of A). (There might be potential issues here)

    Thus you can count the number of times S-a occurs as a sum of two elements of B, which come from four elements of the original array (not including a).

    So O(N^2) time for O(N) elements - O(N^3).

    Hope that helps.

    0 讨论(0)
  • 2021-01-31 11:27

    I think the fact that the numbers must have distinct positions is a red herring. You can use the inclusion-exclusion principle to count the number of all positions (i,j,k,l,m) where x[i]+x[j]+x[k]+x[l]+x[m]=S and i,j,k,l,m are distinct:

     sums with i!=j,i!=k,i!=l...,l!=m  = all sums 
                                       - sums with i=j
                                       - ...
                                       - sums with l=m
                                       + sums with i=j and j=k
                                       + ...
                                       + sums with k=l and l=m
                                       - ...
                                       + sums with i=j=k=l=m
    

    Computing the sums on the right, except the first one, is doable in O(N^2 log N). For example, to find the number of positions (i,i,k,l,m) such that x[i]+x[i]+x[k]+x[l]+x[m]=S you can create sorted arrays with sums {2a+b} and {c+d} and check if they have elements x,y such that x+y=S.

    Main algorithm

    So it's enough to compute how many are there positions (i,j,k,l,m) where x[i]+x[j]+x[k]+x[l]+x[m]=S and i,j,k,l,m are not necessarily different. Basically, you can use Moron's solution this way:

    • Create a sorted array of sums {a+b: a,b are numbers from array}; group equal elements into one, remembering count. For example, for array [1,1,3] you get nine sums [2,2,2,2,4,4,4,4,6] of the form a+b. Then you group same elements remembering counts: [(2,4),(4,4),(6,1)]. This step is O(N^2 log N).

    • For each e, count how many are there pairs of elements in the array that sum to S-e. As in Moron's solution, you have two pointers, one going right, one going left. If the sum is too low, move the first pointer increasing the sum; if the sum is too high, move the second pointer decreasing it.

      Suppose the sum is correct. This means one points to (a,x) and second to (b,y) where a+b=S-e. Increase the counter by x*y and move both pointers (You could move only one pointer, but on the next step there would be no match, and the second pointer would be moved then.).

    For example, for [(2,4),(4,4),(6,1)] array and S-e=8, the first pointer points at (2,4) and the second at (6,1). Since 2+6=8, you add 4 and move both pointers. Now they both point at (4,4), so you increase the counter by 16. Don't stop! The pointers pass each other, and you get first at (6,1), second at (2,4), increase the counter by 4.

    So, in the end, there are 4+16+4=24 ways to get 8 as a sum of 4 elements of [1,1,3]:

    >>> len([k for k in itertools.product([1,1,3],repeat=4) if sum(k) == 8])
    24
    
    Prelude Control.Monad> length [k | k <- replicateM 4 [1,1,3], sum k == 8]
    24
    

    Repeating that for each e, you'll get count of ways to get S as a sum of 5 elements.

    For [1,1,1,1,1] and S-e=4, the sums array would be [(2,25)], and you'd get that there are 625 ways to get sum of 4.

    For each e, this step is linear in size of the array (so it's O(N2)), so the loop takes O(N3).

    On inclusion-exclusion:

    Call a quintuple (i,j,k,l,m) "proper" if x[i]+x[j]+x[k]+x[l]+x[m]=S. The goal is to count the number of proper quintuples (i,j,k,l,m) where i,j,k,l,m are pairwise distinct. The main algorithm can count in O(N^3) how many are there proper quintuples which have not necessarily distinct components. The remaining thing is to count those "wrong" tuples.

    Consider the subsets of proper quintuples

    Axy={(i,j,k,l,m): indices on x-th and y-th place are the same}

    For example, A24 is the set of proper quintuples (i,j,k,l,m) where j=l.

    The set of wrong quintuples is:

    A12 ∪ A13 ∪ ... ∪ A45

    Counting its cardinality by inclusion-exclusion:

    |A12 ∪ A13 ∪ ... ∪ A45| = |A12| + |A13| + ... + |A45| - |A12 ∩ A23| - ... - |A34 ∩ A45| + ... + |A12 ∩ A23 ∩ ... ∩ A35 ∩ A45|

    There are 210=1024 summands here. But a lot of the cardinalities is the same.

    The only things you have to count is:

    • X1 = |A12| - quintuples with i=j
    • X2 = |A12 ∩ A23| - quintuples with i=j=k
    • X3 = |A12 ∩ A23 ∩ A34| - quintuples with i=j=k=l
    • X4 = |A12 ∩ A23 ∩ A34 ∩ A45| - quintuples with i=j=k=l=m
    • X5 = |A12 ∩ A34| - quintuples with i=j,k=l
    • X6 = |A12 ∩ A23 ∩ A45| - quintuples with i=j=k,l=m

    You can observe, by permuting, all other sets are represented here. For example, A24 has the same cardinality as A12.

    Counting cardinalities of those 6 sets is rather easy. For the first one, you create arrays {2a+b} and {c+d} and count how many are there common elements; for the other ones, there are only 3 or less free variables, so even a simple loop will give you O(N^3).

    To simplify the sum, I wrote the following Haskell program:

    import Control.Monad
    import Data.List
    import qualified Data.Map as Map
    
    -- Take equivalence relation, like [(1,2),(2,3)] and return its partition, like [3,1,1]
    f xs = sort $ map length $ foldr f (map return [1..5]) xs
           where f (x,y) a = let [v1] = filter (x `elem`) a
                                 [v2] = filter (y `elem`) a
                             in if v1 == v2 then a else (a \\ [v1,v2]) ++ [v1++v2]
    
    -- All 1024 subsets of [(1,2),(1,3), ..., (4,5)]
    subsets = filterM (const [False, True]) [(i,j) | i <- [1..5], j <- [i+1..5]]
    
    res = Map.fromListWith (+) $ map (\k -> (f k, (-1)^(length k))) subsets
    
    *Main> res
    Loading package array-0.3.0.1 ... linking ... done.
    Loading package containers-0.3.0.0 ... linking ... done.
    fromList [([1,1,1,1,1],1),([1,1,1,2],-10),([1,1,3],20),([1,2,2],15),([1,4],-30),([2,3],-20),([5],24)]
    

    which means that the formula is

    all subsets - 10X1 + 20X2 - 30X3 + 24X4 + 15X5 - 20X6.

    Check:

    How many are there quintuples in [0,0,0,...,0] summing up to 0? One way to compute that is directly, second way is to use the formula (and not care about distinct positions):

    direct x = x*(x-1)*(x-2)*(x-3)*(x-4)
    indirect x = x^5 - 10 * x^4 + 20 * x^3 + 15 * x^3 - 30 * x^2 - 20*x^2 + 24*x
    
    *Main> direct 100
    9034502400
    *Main> indirect 100
    9034502400
    

    Other remarks:

    Also, there's O(an log an) solution: Compute (xa1 + ... + xan)5 using FFT, the result is coefficient at xS. This allows some ai to be used twice, but you can subtract polynomials like (x2a1 + ... + x2an)5*(xa1 + ... + xan)3 etc. according to inclusion-exclusion principle.

    In some restricted models of computation, it has been shown decision version of this problem requires O(N^3) time.

    0 讨论(0)
提交回复
热议问题