Getting the lowest possible sum from numbers' difference


滥情空心

I have to find the lowest possible sum from numbers' difference.

Let's say I have 4 numbers: 1515, 1520, 1500 and 1535. The lowest sum of differences is 30, becaus


10 answers

北海茫月

2020-12-23 22:46
I've taken an approach which uses a recursive algorithm, but it does take some of what other people have contributed.

First of all we sort the numbers:
```
[1561,1572,1572,1609,1682,1731,1731,2041]
```
Then we compute the differences between neighbouring numbers, keeping track of the indices of the numbers that contributed to each difference:
```
[(11,(0,1)),(0,(1,2)),(37,(2,3)),(73,(3,4)),(49,(4,5)),(0,(5,6)),(310,(6,7))]
```
So we got 11 by getting the difference between number at index 0 and number at index 1, 37 from the numbers at indices 2 & 3.
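A minimal Python sketch of these two steps (sort, then record each neighbouring difference with its index pair). Python is used only for illustration, and `indexed_differences` is a hypothetical name, not something from the Haskell code further down:

```python
def indexed_differences(values):
    """Sort the values and pair each neighbouring difference with its indices."""
    s = sorted(values)
    # (difference, (index of smaller number, index of larger number))
    return [(s[i + 1] - s[i], (i, i + 1)) for i in range(len(s) - 1)]


numbers = [1731, 1572, 2041, 1561, 1682, 1572, 1609, 1731]
diffs = indexed_differences(numbers)
by_size = sorted(diffs)  # smallest differences first, ties broken by index pair
```

Sorting the tuples directly works because Python compares them element by element, so the difference is the primary key.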

I then sorted this list, so it tells me which pairs give me the smallest difference:
```
[(0,(1,2)),(0,(5,6)),(11,(0,1)),(37,(2,3)),(49,(4,5)),(73,(3,4)),(310,(6,7))]
```
What we can see here is that, given that we want to select n numbers, a naive solution might be to select the first n / 2 items of this list. The trouble is, in this list the third item shares an index with the first, so we'd only actually get 5 numbers, not 6. In this case you need to select the fourth pair as well to get a set of 6 numbers.
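To see the overlap concretely, here is a small Python sketch of the naive selection for n = 6. The helper name `naive_pick` is made up for illustration:

```python
def naive_pick(sorted_diffs, n):
    """Take the first n // 2 differences and report which indices they cover."""
    picked = sorted_diffs[: n // 2]
    covered = {i for _, pair in picked for i in pair}
    return picked, covered


sorted_diffs = [(0, (1, 2)), (0, (5, 6)), (11, (0, 1)), (37, (2, 3)),
                (49, (4, 5)), (73, (3, 4)), (310, (6, 7))]
picked, covered = naive_pick(sorted_diffs, 6)
# Index 1 appears in both (1, 2) and (0, 1), so only 5 distinct indices are covered.
```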

From here, I came up with this algorithm. Throughout, there is a set of accepted indices, which starts empty, and a count n of numbers still left to select:
1. If n is 0, we're done.
2. If n is 1, and the first item will provide just 1 index which isn't in our set, we take the first item, and we're done.
3. If n is 2 or more, and the first item will provide 2 indices which aren't in our set, we take the first item, and we recurse (i.e. go to 1), this time looking for n - 2 numbers that make the smallest difference in the remainder of the list.
This is the basic routine, but life isn't that simple. There are cases we haven't covered yet, but make sure you get the idea before you move on.

Actually step 3 is wrong (found that just before I posted this :-/), as it may be unnecessary to include an early difference to cover indices which are covered by later, essential differences. The first example ([1515, 1520, 1500, 1535]) falls foul of this. Because of this I've thrown it away in the section below, and expanded step 4 to deal with it.

So, now we get to look at the special cases:
1. ** as above **
2. ** as above **
3. If n is 1, but the first item will provide two indices, we can't select it. We have to throw that item away and recurse. This time we're still looking for n indices, and there have been no changes to our accepted set.
4. If n is 2 or more, we have a choice. Either we can a) choose this item, and recurse looking for n - (1 or 2) indices, or b) skip this item, and recurse looking for n indices.
Case 4 is where it gets tricky, and where this routine turns into a search rather than just a sorting exercise. How can we decide which branch (a or b) to take? Well, we're recursive, so let's call both, and see which one is better. How will we judge them?
- We'll want to take whichever branch produces the lowest sum.
- ...but only if it will use up the right number of indices.
So step 4 becomes something like this (pseudocode):
```
x       = numberOfIndicesProvidedBy(currentDifference)
// branch A takes the current difference, then recurses looking for n-(1 or 2)
branchA = currentDifference + findSmallestDifference (n-x, remainingDifferences)
// branch B skips the current difference, then recurses looking for n
branchB = findSmallestDifference (n  , remainingDifferences)
sumA    = sumOf(branchA)
sumB    = sumOf(branchB)

validA  = indicesAddedBy(branchA) == n
validB  = indicesAddedBy(branchB) == n

if not validA && not validB then return an empty branch

if validA && not validB then return branchA
if validB && not validA then return branchB

// Here, both must be valid.
if sumA <= sumB then return branchA else return branchB
```
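The take-or-skip idea in the pseudocode can also be sketched as a small exhaustive search. This is a Python sketch under my own assumptions, not a port of the Haskell: `search` and `find_least_diff` are names chosen for illustration, an impossible branch is signalled with `None` instead of an empty list, and a difference that adds no new index is always skipped. It tries both branches at every step, so it is exponential in the worst case.

```python
def search(diffs, n, taken=frozenset()):
    """Cheapest way to cover exactly n more indices, or None if impossible."""
    if n == 0:
        return 0, []
    if not diffs:
        return None
    (d, pair), rest = diffs[0], diffs[1:]
    new = set(pair) - taken
    # Branch B: skip the current difference.
    skip = search(rest, n, taken)
    # Taking it is pointless if it adds nothing, and invalid if it overshoots n.
    if not new or len(new) > n:
        return skip
    # Branch A: take the current difference and look for the remaining indices.
    sub = search(rest, n - len(new), taken | new)
    take = None if sub is None else (d + sub[0], [(d, pair)] + sub[1])
    options = [o for o in (take, skip) if o is not None]
    return min(options, key=lambda o: o[0]) if options else None


def find_least_diff(n, values):
    """Sort, build the indexed neighbouring differences, then search."""
    s = sorted(values)
    diffs = sorted((s[i + 1] - s[i], (i, i + 1)) for i in range(len(s) - 1))
    return search(diffs, n)
```

Calling `find_least_diff(4, [1515, 1520, 1500, 1535])` returns the total together with the chosen `(difference, (i, j))` entries, where `i` and `j` index the sorted list.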
I coded this up in Haskell (because I'm trying to get good at it). I'm not sure about posting the whole thing, because it might be more confusing than useful, but here's the main part:
```
findSmallestDifference = findSmallestDifference' Set.empty

findSmallestDifference' _     _ [] = []
findSmallestDifference' taken n (d:ds)
    | n == 0                = []    -- Case 1
    | n == 1 && provides1 d = [d]   -- Case 2
    | n == 1 && provides2 d = findSmallestDifference' taken n ds -- Case 3
    | provides0 d           = findSmallestDifference' taken n ds -- Case 3a (See Edit)
    | validA && not validB             = branchA -- Case 4
    | validB && not validA             = branchB -- Case 4
    | validA && validB && sumA <= sumB = branchA -- Case 4
    | validA && validB && sumB <= sumA = branchB -- Case 4
    | otherwise             = []                 -- Case 4
        where branchA = d : findSmallestDifference' (newTaken d) (n - (provides taken d)) ds
              branchB = findSmallestDifference' taken n ds
              sumA    = sumDifferences branchA
              sumB    = sumDifferences branchB
              validA  = n == (indicesTaken branchA)
              validB  = n == (indicesTaken branchB)
              newTaken x = insertIndices x taken 
```
Hopefully you can see all the cases there. That code(-ish), plus some wrapper produces this:
```
*Main> findLeastDiff 6 [1731, 1572, 2041, 1561, 1682, 1572, 1609, 1731]
Smallest Difference found is 48
      1572 -   1572 =      0
      1731 -   1731 =      0
      1572 -   1561 =     11
      1609 -   1572 =     37
*Main> findLeastDiff 4 [1515, 1520, 1500,1535]
Smallest Difference found is 30
      1515 -   1500 =     15
      1535 -   1520 =     15
```
This has become long, but I've tried to be explicit. Hopefully it was worthwhile.

Edit : There is a case 3a that can be added to avoid some unnecessary work. If the current difference provides no additional indices, it can be skipped. This is taken care of in step 4 above, but there's no point in evaluating both halves of the tree for no gain. I've added this to the Haskell.

既然无缘

2020-12-23 22:47

I would go with marcog's answer; you can sort using any of the sorting algorithms. But there is one more thing to analyze.

If you have to choose R numbers out of N so that the sum of their differences is minimal, then the numbers must be chosen as a consecutive run in sorted order, without skipping any numbers in between.

Hence, after sorting the array, run an outer loop over the start positions 0 to N-R and an inner loop of R-1 iterations to calculate the sum of differences for each run.

If needed, try it with some examples.
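A minimal Python sketch of that double loop, assuming "sum of their differences" means the sum of the gaps between neighbouring numbers in the chosen run (one possible reading of this answer). The name `best_window` is made up for illustration:

```python
def best_window(values, r):
    """Return (sum of neighbouring gaps, run) for the tightest run of r sorted values."""
    s = sorted(values)
    best = None
    for start in range(len(s) - r + 1):      # outer loop: start positions 0 .. N-R
        total = 0
        for k in range(r - 1):               # inner loop: R-1 neighbouring gaps
            total += s[start + k + 1] - s[start + k]
        if best is None or total < best[0]:
            best = (total, s[start:start + r])
    return best
```

With `[1515, 1520, 1500, 1535]` and `r = 2` this picks the run `[1515, 1520]`. Note that the inner sum always equals the last value of the run minus the first, so the inner loop could be replaced by a single subtraction.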

生来不讨喜

2020-12-23 22:52
Something like
1. Sort List
2. Find Duplicates
3. Make the duplicates a pair
4. remove duplicates from list
5. break rest of list into pairs
6. calculate differences of each pair
7. take lowest amounts
In your example you have 8 numbers and need the best 3 pairs. First sort the list, which gives you
```
1561, 1572, 1572, 1609, 1682, 1731, 1731, 2041
```
If you have duplicates make them a pair and remove them from the list so you have
```
[1572, 1572] = 0
[1731, 1731] = 0
L = { 1561, 1609, 1682, 2041 }
```
Break the remaining list into pairs, giving you the 4 following pairs
```
[1572, 1572] = 0
[1731, 1731] = 0
[1561, 1609] = 48
[1682, 2041] = 359
```
Then drop as many pairs as you need to.

This gives you the following 3 pairs with the lowest differences
```
[1572, 1572] = 0
[1731, 1731] = 0
[1561, 1609] = 48
```
So
```
0 + 0 + 48 = 48
```
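The seven steps above can be sketched in Python roughly as follows. This is a sketch of the heuristic exactly as described, with no claim that it is optimal for every input, and `pair_by_duplicates` is a hypothetical name:

```python
def pair_by_duplicates(values, pairs_needed):
    """Pair up duplicates first, pair the leftovers in order, keep the cheapest pairs."""
    s = sorted(values)
    pairs, rest = [], []
    i = 0
    while i < len(s):
        if i + 1 < len(s) and s[i] == s[i + 1]:
            pairs.append((s[i], s[i + 1]))   # a duplicate becomes a zero-cost pair
            i += 2
        else:
            rest.append(s[i])                # no duplicate: keep for the next stage
            i += 1
    # Break the remaining sorted list into consecutive pairs.
    pairs += [(rest[j], rest[j + 1]) for j in range(0, len(rest) - 1, 2)]
    pairs.sort(key=lambda p: p[1] - p[0])    # lowest differences first
    chosen = pairs[:pairs_needed]
    return sum(b - a for a, b in chosen), chosen


total, chosen = pair_by_duplicates(
    [1731, 1572, 2041, 1561, 1682, 1572, 1609, 1731], 3)
```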

时光说笑

2020-12-23 22:56

I know you said you did not need code, but it is the best way for me to describe a set-based solution. The solution runs under SQL Server 2008. Included in the code is the data for the two examples you give. The SQL solution could be done with a single self-joining table, but I find it easier to explain when there are multiple tables.

```
--table 1 holds the values

declare @Table1 table (T1_Val int)
Insert @Table1 
--this data is test 1
--Select (1515) Union ALL
--Select (1520) Union ALL
--Select (1500) Union ALL
--Select (1535) 

--this data is test 2
Select (1731) Union ALL
Select (1572) Union ALL
Select (2041) Union ALL
Select (1561) Union ALL
Select (1682) Union ALL
Select (1572) Union ALL
Select (1609) Union ALL
Select (1731) 
--Select * from @Table1

--table 2 holds the sorted numbered list
Declare @Table2 table (T2_id int identity(1,1), T1_Val int)
Insert @Table2 Select T1_Val from @Table1 order by T1_Val

--table 3 will hold the sorted pairs
Declare @Table3 table (T3_id int identity(1,1), T21_id int, T21_Val int, T22_id int, T22_val int)
Insert @Table3
Select T2_1.T2_id, T2_1.T1_Val,T2_2.T2_id, T2_2.T1_Val from @Table2 AS T2_1
LEFT Outer join @Table2 AS T2_2 on T2_1.T2_id = T2_2.T2_id +1

--select * from @Table3
--remove odd numbered rows
delete from @Table3 where T3_id % 2 > 0 

--select * from @Table3
--show the diff values
--select *, ABS(T21_Val - T22_val) from @Table3
--show the diff values in order
--select *, ABS(T21_Val - T22_val) from @Table3 order by ABS(T21_Val - T22_val)
--display the two lowest
select TOP 2 CAST(T22_val as varchar(24)) + ' and ' + CAST(T21_val as varchar(24)) as 'The minimum difference pairs are'
, ABS(T21_Val - T22_val) as 'Difference'
from @Table3
ORDER by ABS(T21_Val - T22_val)
```
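For readers who do not use SQL Server, here is a rough Python sketch of what the tables above appear to do: sort the values, pair the 1st with the 2nd, the 3rd with the 4th and so on, then show the pairs with the smallest differences. It assumes the rows reach `@Table3` in `T2_id` order, which the SQL does not state explicitly, and `top_adjacent_pairs` is a made-up name:

```python
def top_adjacent_pairs(values, top=2):
    """Pair sorted values two at a time and return the `top` closest pairs."""
    s = sorted(values)                                     # @Table2: sorted, numbered list
    pairs = [(s[i], s[i + 1]) for i in range(0, len(s) - 1, 2)]  # @Table3 after the delete
    pairs.sort(key=lambda p: p[1] - p[0])                  # ORDER BY the difference
    return [(a, b, b - a) for a, b in pairs[:top]]         # TOP n with the difference


test1 = top_adjacent_pairs([1515, 1520, 1500, 1535])
test2 = top_adjacent_pairs([1731, 1572, 2041, 1561, 1682, 1572, 1609, 1731])
```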
