fitting n variable height images into 3 (similar length) column layout

问题

I'm looking to make a 3-column layout similar to that of piccsy.com. Given a number of images of the same width but varying height, what is a algorithm to order them so that the difference in column lengths is minimal? Ideally in Python or JavaScript...

Thanks a lot for your help in advance!

Martin

回答1:

How many images?

If you limit the maximum page size, and have a value for the minimum picture height, you can calculate the maximum number of images per page. You would need this when evaluating any solution.

I think there were 27 pictures on the link you gave.

The following uses the first_fit algorithm mentioned by Robin Green earlier but then improves on this by greedy swapping.

The swapping routine finds the column that is furthest away from the average column height then systematically looks for a swap between one of its pictures and the first picture in another column that minimizes the maximum deviation from the average.

I used a random sample of 30 pictures with heights in the range five to 50 'units'. The convergenge was swift in my case and improved significantly on the first_fit algorithm.

The code (Python 3.2:

def first_fit(items, bincount=3):
    items = sorted(items, reverse=1) # New - improves first fit.
    bins     = [[] for c in range(bincount)]
    binsizes = [0] * bincount
    for item in items:
        minbinindex = binsizes.index(min(binsizes))
        bins[minbinindex].append(item)
        binsizes[minbinindex] += item
    average = sum(binsizes) / float(bincount)
    maxdeviation = max(abs(average - bs) for bs in binsizes)

    return bins, binsizes, average, maxdeviation

def swap1(columns, colsize, average, margin=0):
    'See if you can do a swap to smooth the heights'
    colcount = len(columns)
    maxdeviation, i_a = max((abs(average - cs), i)
                              for i,cs in enumerate(colsize))
    col_a = columns[i_a]
    for pic_a in set(col_a): # use set as if same height then only do once
        for i_b, col_b in enumerate(columns):
            if i_a != i_b: # Not same column
                for pic_b in set(col_b):
                    if (abs(pic_a - pic_b) > margin): # Not same heights
                        # new heights if swapped
                        new_a = colsize[i_a] - pic_a + pic_b
                        new_b = colsize[i_b] - pic_b + pic_a
                        if all(abs(average - new) < maxdeviation
                               for new in (new_a, new_b)):
                            # Better to swap (in-place)
                            colsize[i_a] = new_a
                            colsize[i_b] = new_b
                            columns[i_a].remove(pic_a)
                            columns[i_a].append(pic_b)
                            columns[i_b].remove(pic_b)
                            columns[i_b].append(pic_a)
                            maxdeviation = max(abs(average - cs)
                                               for cs in colsize)
                            return True, maxdeviation
    return False, maxdeviation

def printit(columns, colsize, average, maxdeviation):
    print('columns')
    pp(columns)
    print('colsize:', colsize)
    print('average, maxdeviation:', average, maxdeviation)
    print('deviations:', [abs(average - cs) for cs in colsize])
    print()


if __name__ == '__main__':
    ## Some data
    #import random
    #heights = [random.randint(5, 50) for i in range(30)]
    ## Here's some from the above, but 'fixed'.
    from pprint import pprint as pp

    heights = [45, 7, 46, 34, 12, 12, 34, 19, 17, 41,
               28, 9, 37, 32, 30, 44, 17, 16, 44, 7,
               23, 30, 36, 5, 40, 20, 28, 42, 8, 38]

    columns, colsize, average, maxdeviation = first_fit(heights)
    printit(columns, colsize, average, maxdeviation)
    while 1:
        swapped, maxdeviation = swap1(columns, colsize, average, maxdeviation)
        printit(columns, colsize, average, maxdeviation)
        if not swapped:
            break
        #input('Paused: ')

The output:

columns
[[45, 12, 17, 28, 32, 17, 44, 5, 40, 8, 38],
 [7, 34, 12, 19, 41, 30, 16, 7, 23, 36, 42],
 [46, 34, 9, 37, 44, 30, 20, 28]]
colsize: [286, 267, 248]
average, maxdeviation: 267.0 19.0
deviations: [19.0, 0.0, 19.0]

columns
[[45, 12, 17, 28, 17, 44, 5, 40, 8, 38, 9],
 [7, 34, 12, 19, 41, 30, 16, 7, 23, 36, 42],
 [46, 34, 37, 44, 30, 20, 28, 32]]
colsize: [263, 267, 271]
average, maxdeviation: 267.0 4.0
deviations: [4.0, 0.0, 4.0]

columns
[[45, 12, 17, 17, 44, 5, 40, 8, 38, 9, 34],
 [7, 34, 12, 19, 41, 30, 16, 7, 23, 36, 42],
 [46, 37, 44, 30, 20, 28, 32, 28]]
colsize: [269, 267, 265]
average, maxdeviation: 267.0 2.0
deviations: [2.0, 0.0, 2.0]

columns
[[45, 12, 17, 17, 44, 5, 8, 38, 9, 34, 37],
 [7, 34, 12, 19, 41, 30, 16, 7, 23, 36, 42],
 [46, 44, 30, 20, 28, 32, 28, 40]]
colsize: [266, 267, 268]
average, maxdeviation: 267.0 1.0
deviations: [1.0, 0.0, 1.0]

columns
[[45, 12, 17, 17, 44, 5, 8, 38, 9, 34, 37],
 [7, 34, 12, 19, 41, 30, 16, 7, 23, 36, 42],
 [46, 44, 30, 20, 28, 32, 28, 40]]
colsize: [266, 267, 268]
average, maxdeviation: 267.0 1.0
deviations: [1.0, 0.0, 1.0]

Nice problem.

Heres the info on reverse-sorting mentioned in my separate comment below.

>>> h = sorted(heights, reverse=1)
>>> h
[46, 45, 44, 44, 42, 41, 40, 38, 37, 36, 34, 34, 32, 30, 30, 28, 28, 23, 20, 19, 17, 17, 16, 12, 12, 9, 8, 7, 7, 5]
>>> columns, colsize, average, maxdeviation = first_fit(h)
>>> printit(columns, colsize, average, maxdeviation)
columns
[[46, 41, 40, 34, 30, 28, 19, 12, 12, 5],
 [45, 42, 38, 36, 30, 28, 17, 16, 8, 7],
 [44, 44, 37, 34, 32, 23, 20, 17, 9, 7]]
colsize: [267, 267, 267]
average, maxdeviation: 267.0 0.0
deviations: [0.0, 0.0, 0.0]

If you have the reverse-sorting, this extra code appended to the bottom of the above code (in the 'if name == ...), will do extra trials on random data:

for trial in range(2,11):
    print('\n## Trial %i' % trial)
    heights = [random.randint(5, 50) for i in range(random.randint(5, 50))]
    print('Pictures:',len(heights))
    columns, colsize, average, maxdeviation = first_fit(heights)
    print('average %7.3f' % average, '\nmaxdeviation:')
    print('%5.2f%% = %6.3f' % ((maxdeviation * 100. / average), maxdeviation))
    swapcount = 0
    while maxdeviation:
        swapped, maxdeviation = swap1(columns, colsize, average, maxdeviation)
        if not swapped:
            break
        print('%5.2f%% = %6.3f' % ((maxdeviation * 100. / average), maxdeviation))
        swapcount += 1
    print('swaps:', swapcount)

The extra output shows the effect of the swaps:

## Trial 2
Pictures: 11
average  72.000 
maxdeviation:
 9.72% =  7.000
swaps: 0

## Trial 3
Pictures: 14
average 118.667 
maxdeviation:
 6.46% =  7.667
 4.78% =  5.667
 3.09% =  3.667
 0.56% =  0.667
swaps: 3

## Trial 4
Pictures: 46
average 470.333 
maxdeviation:
 0.57% =  2.667
 0.35% =  1.667
 0.14% =  0.667
swaps: 2

## Trial 5
Pictures: 40
average 388.667 
maxdeviation:
 0.43% =  1.667
 0.17% =  0.667
swaps: 1

## Trial 6
Pictures: 5
average  44.000 
maxdeviation:
 4.55% =  2.000
swaps: 0

## Trial 7
Pictures: 30
average 295.000 
maxdeviation:
 0.34% =  1.000
swaps: 0

## Trial 8
Pictures: 43
average 413.000 
maxdeviation:
 0.97% =  4.000
 0.73% =  3.000
 0.48% =  2.000
swaps: 2

## Trial 9
Pictures: 33
average 342.000 
maxdeviation:
 0.29% =  1.000
swaps: 0

## Trial 10
Pictures: 26
average 233.333 
maxdeviation:
 2.29% =  5.333
 1.86% =  4.333
 1.43% =  3.333
 1.00% =  2.333
 0.57% =  1.333
swaps: 4

回答2:

This is the offline makespan minimisation problem, which I think is equivalent to the multiprocessor scheduling problem. Instead of jobs you have images, and instead of job durations you have image heights, but it's exactly the same problem. (The fact that it involves space instead of time doesn't matter.) So any algorithm that (approximately) solves either of them will do.

回答3:

Here's an algorithm (called First Fit Decreasing) that will get you a very compact arrangement, in a reasonable amount of time. There may be a better algorithm but this is ridiculously simple.

Sort the images in order from tallest to shortest.
Take the first image, and place it in the shortest column. (If multiple columns are the same height (and shortest) pick any one.)
Repeat step 2 until no images remain.

When you're done, you can re-arrange the elements in the each column however you choose if you don't like the tallest-to-shortest look.

回答4:

Here's one:

 // Create initial solution
 <run First Fit Decreasing algorithm first>
 // Calculate "error", i.e. maximum height difference
 // after running FFD
 err = (maximum_height - minimum_height)
 minerr = err

 // Run simple greedy optimization and random search
 repeat for a number of steps: // e.g. 1000 steps
    <find any two random images a and b from two different columns such that
     swapping a and b decreases the error>
    if <found>:
         swap a and b
         err = (maximum_height - minimum_height)
         if (err < minerr):
              <store as best solution so far> // X
    else:
         swap two random images from two columns
         err = (maximum_height - minimum_height)

 <output the best solution stored on line marked with X>

来源：https://stackoverflow.com/questions/5626520/fitting-n-variable-height-images-into-3-similar-length-column-layout

标签

algorithm

layout

scheduling

packing