Algorithm (prob. solving) achieving fastest runtime

广开言路 2021-01-31 11:19

For an algorithm competition training (not homework) we were given this question from a past year. Posted it to this site because the other site required a login.

This i

4 answers
  • 2021-01-31 11:38

    Here is a less mathematically inclined solution that works in O(n).

    Let us partition the houses (indexing starts at 0) into two disjoint sets:

    • F, "front", where people walk CCW to the house
    • B, "back", where people walk CW to the house

    and a single house p that marks the current position where the plant would be built.

    I have based my illustration on the example given in the image.

    By convention, let's assign half the houses to F, and exactly one less to B.

    • F contains 6 houses
    • B contains 5 houses

    With simple modular arithmetic, we can easily access the houses by (p + offset) % 12 thanks to Python's sane implementation of the modulo operator, quite unlike some other popular languages.
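
    As a minimal sketch of that indexing (the helper name house_at and the choice of positive offsets for F are illustrative assumptions, not part of the original program):

```python
# Population per position for a 12-position example island.
houses = [0, 3, 0, 1, 0, 0, 0, 0, 0, 0, 1, 2]
L = len(houses)

def house_at(p, offset):
    """Population of the house `offset` steps away from p, wrapping around."""
    # Python's % always returns a value in [0, L), even for negative operands.
    return houses[(p + offset) % L]

p = 0
front = [house_at(p, k) for k in range(1, 7)]   # offsets +1 .. +6 (6 houses)
back = [house_at(p, -k) for k in range(1, 6)]   # offsets -1 .. -5 (5 houses)

print((-1) % 12)  # 11
print(front)      # [3, 0, 1, 0, 0, 0]
print(back)       # [2, 1, 0, 0, 0]
```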

    If we arbitrarily choose a position for p, we can determine the consumption of water in O(L) trivially.

    We could do this all over again for a different position of p to arrive at a runtime of O(L^2).
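
    A brute-force version of this idea might look like the sketch below. The function names are hypothetical, and houses is assumed to be a list holding the population at each position.

```python
def consumption(houses, p):
    """Total consumption if the plant is built at position p: O(L)."""
    L = len(houses)
    total = 0
    for pos, people in enumerate(houses):
        d = (pos - p) % L
        total += people * min(d, L - d)  # shorter way around the island
    return total

def best_position_bruteforce(houses):
    """Try every position: O(L) positions times O(L) each, so O(L^2)."""
    candidates = ((p, consumption(houses, p)) for p in range(len(houses)))
    return max(candidates, key=lambda pair: pair[1])

print(best_position_bruteforce([0, 3, 0, 1, 0, 0, 0, 0, 0, 0, 1, 2]))  # (7, 33)
```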

    However, if we only shift p by one position, we can determine the new consumption in O(1) if we make a somewhat clever observation: the number of people living in F (or B, respectively) determines how much the consumption of F (or B) changes when we set p' = p + 1 (plus some corrections, because the sets themselves change). I have tried to depict this here to the best of my abilities.

    [Figure: algorithm depiction]

    We end up with a total running time of O(L).

    The program for this algorithm is at the end of the post.

    But we can do better. As long as no houses change between the sets, the corrections to the consumptions (Cs) and weights (Ws) are zero, so each step changes the total by the same amount. We can count how many such steps there are and apply them all at once.

    Houses change sets in two cases:

    • when p is on a house
    • when p is opposite a house

    In the following diagram, I have visualized the stops the algorithm now makes to update the Cs and Ws. The house that causes the algorithm to stop is highlighted.

    [Figure: optimized algorithm]

    The algorithm begins at a house (or the opposite of one, we'll see why later), in this case that happens to be a house.

    Again, we have both a consumption C(B) = 3*1 and C(F) = 2*1. If we shift p to the right by one, we add 4 to C(B) and subtract 1 from C(F). If we shift p once again, the exact same thing happens.

    As long as the same two sets of houses move closer and further away respectively, the changes to the Cs are constant.
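
    This can be checked numerically. The sketch below uses hypothetical helper names and assumes houses is a list of (position, population) pairs on a circle of even length L. It computes the stops and confirms that the total consumption changes by a constant amount per step between two consecutive stops.

```python
def total_consumption(houses, L, p):
    """Sum of population * circular distance to p over all houses."""
    total = 0
    for pos, people in houses:
        d = (pos - p) % L
        total += people * min(d, L - d)
    return total

def stops(houses, L):
    """Positions that are on a house or exactly opposite one."""
    on_house = {pos for pos, _ in houses}
    opposite = {(pos + L // 2) % L for pos, _ in houses}
    return sorted(on_house | opposite)

def slopes_between_stops(houses, L):
    """Per-step change of the total between each pair of consecutive stops."""
    result = []
    points = stops(houses, L)
    for a, b in zip(points, points[1:]):
        steps = {total_consumption(houses, L, p + 1) - total_consumption(houses, L, p)
                 for p in range(a, b)}
        assert len(steps) == 1, "consumption should be linear between stops"
        result.append(steps.pop())
    return result

houses = [(1, 3), (3, 1), (10, 1), (11, 2)]
print(stops(houses, 12))                 # [1, 3, 4, 5, 7, 9, 10, 11]
print(slopes_between_stops(houses, 12))  # [5, 7, 5, 1, -5, -7, -5]
```

    The slope only changes sign once here, between positions 7 and 9, which is why the maximum sits at a stop.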

    We now change the definition of B slightly: It will now also contain p! (This does not change the above paragraphs regarding this optimized version of the algorithm).

    This is done because when we move to the next step, we will add the weight of the houses that are moving away repeatedly. The house at the current position is moving away when p moves to the right, thus W(B) is the correct summand.

    The other case is when a house stops moving away and comes closer again. In that case the Cs change drastically because 6*weight goes from one C to the other. That is the other case when we need to stop.

    [Figure: new calculations]

    I hope it is clear how and why this works, so I'll just leave the working algorithm here. Please ask if something is not clear.

    O(n):

    import itertools
    
    def hippo_island(houses, L):
        return PlantBuilder(houses, L).solution
    
    class PlantBuilder:
        def __init__(self, houses, L):
            self.L = L
            self.houses = sorted(houses)
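            # Two events per house: +population at the house itself and
            # -population at the point opposite it. Integer division keeps the
            # positions integral in Python 3 (this assumes L is even).
            self.changes = sorted(
                [((pos + L // 2) % L, -transfer) for pos, transfer in self.houses] +
                self.houses)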
            self.starting_position = min(self.changes)[0]
    
            def is_front(pos_population):
                pos = pos_population[0]
                pos += L if pos < self.starting_position else 0
                return self.starting_position < pos <= self.starting_position + L // 2
    
            front_houses = list(filter(is_front, self.houses))
            # itertools.ifilterfalse exists only in Python 2; Python 3 names it filterfalse.
            back_houses = list(itertools.filterfalse(is_front, self.houses))
    
            self.front_count = len(houses) // 2
            self.back_count = len(houses) - self.front_count - 1
            (self.back_weight, self.back_consumption) = self._initialize_back(back_houses)
            (self.front_weight, self.front_consumption) = self._initialize_front(front_houses)
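            # The initial candidate is the consumption at the starting position
            # (not the sum of the weights).
            self.solution = (self.starting_position,
                             self.back_consumption + self.front_consumption)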
            self.run()
    
        def distance(self, i, j):
            return min((i - j) % self.L, self.L - (i - j) % self.L)
    
        def run(self):
            for (position, weight) in self.consumptions():
                self.update_solution(position, weight)
    
        def consumptions(self):
            last_position = self.starting_position
            for position, transfer in self.changes[1:]:
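                distance = position - last_position
                # Houses in F come closer, houses in B (which includes p) move away.
                self.front_consumption -= distance * self.front_weight
                self.back_consumption += distance * self.back_weight

                self.back_weight += transfer
                self.front_weight -= transfer

                # We are opposite a house: it changes from B to F at distance L/2.
                # transfer is negative here, so B shrinks and F grows by the
                # same amount; the total consumption is unchanged.
                if transfer < 0:
                    self.back_consumption += self.L // 2 * transfer
                    self.front_consumption -= self.L // 2 * transfer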
    
    
                last_position = position
                yield (position, self.back_consumption + self.front_consumption)
    
        def update_solution(self, position, weight):
            (best_position, best_weight) = self.solution
            if weight > best_weight:
                self.solution = (position, weight)
    
        def _initialize_front(self, front_houses):
            weight = 0
            consumption = 0
            for position, population in front_houses:
                distance = self.distance(self.starting_position, position)
                consumption += distance * population
                weight += population
            return (weight, consumption)
    
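        def _initialize_back(self, back_houses):
            # B also contains the house at p, if there is one. Its distance is 0,
            # so it only contributes to the weight. Looping over every back house
            # also covers a start opposite a house, where back_houses[0] is not at p.
            weight = 0
            consumption = 0
            for position, population in back_houses:
                distance = self.distance(self.starting_position, position)
                consumption += distance * population
                weight += population
            return (weight, consumption)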
    

    O(L):

    def hippo_island(houses):
        return PlantBuilder(houses).solution
    
    class PlantBuilder:
        def __init__(self, houses):
            self.houses = houses
            self.front_count = len(houses) // 2
            self.back_count = len(houses) - self.front_count - 1
            (self.back_weight, self.back_consumption) = self.initialize_back()
            (self.front_weight, self.front_consumption) = self.initialize_front()
            # The initial candidate is the consumption at position 0, not the sum of the weights.
            self.solution = (0, self.back_consumption + self.front_consumption)
            self.run()
    
        def run(self):
            for (position, weight) in self.consumptions():
                self.update_solution(position, weight)
    
        def consumptions(self):
            for position in range(1, len(self.houses)):
                self.remove_current_position_from_front(position)
    
                self.add_house_furthest_from_back_to_front(position)
                self.remove_furthest_house_from_back(position)
    
                self.add_house_at_last_position_to_back(position)
                yield (position, self.back_consumption + self.front_consumption)
    
        def add_house_at_last_position_to_back(self, position):
            self.back_weight += self.houses[position - 1]
            self.back_consumption += self.back_weight
    
        def remove_furthest_house_from_back(self, position):
            house_position = position - self.back_count - 1
            distance = self.back_count
            self.back_weight -= self.houses[house_position]
            self.back_consumption -= distance * self.houses[house_position]
    
        def add_house_furthest_from_back_to_front(self, position):
            house_position = position - self.back_count - 1
            distance = self.front_count
            self.front_weight += self.houses[house_position]
            self.front_consumption += distance * self.houses[house_position]
    
        def remove_current_position_from_front(self, position):
            self.front_consumption -= self.front_weight
            self.front_weight -= self.houses[position]
    
        def update_solution(self, position, weight):
            (best_position, best_weight) = self.solution
            if weight > best_weight:
                self.solution = (position, weight)
    
        def initialize_front(self):
            weight = 0
            consumption = 0
            for distance in range(1, self.front_count + 1):
                consumption += distance * self.houses[distance]
                weight += self.houses[distance]
            return (weight, consumption)
    
        def initialize_back(self):
            weight = 0
            consumption = 0
            for distance in range(1, self.back_count + 1):
                consumption += distance * self.houses[-distance]
                weight += self.houses[-distance]
            return (weight, consumption)
    

    Result:

    >>> hippo_island([0, 3, 0, 1, 0, 0, 0, 0, 0, 0, 1, 2])
    (7, 33)
    
  • 2021-01-31 11:39

    Suppose the list houses is composed of pairs (x,pop) with 0 <= x < 4*L the location and pop the population.

    The objective function, which we want to maximize, is

    def revenue(i):
        return sum(pop * min((i-j)%(4*L), 4*L - (i-j)%(4*L)) for j,pop in houses)
    

    The naive O(LN) algorithm is simply:

    max_revenue = max(revenue(i) for i in range(4*L))
    

    But it is incredibly wasteful to entirely re-evaluate revenue for each location.

    To avoid that, notice that this is a piecewise-linear function; so its derivative is piecewise constant, with discontinuities at two kinds of points:

    • at house i, the derivative changes from slope to slope + 2*population[i]
    • at the point located opposite house i on the island, the derivative changes from slope to slope - 2*population[i]

    This makes things very simple:

    1. We only have to examine actual houses or opposite-of-houses, so the complexity drops to O(N²).
    2. We know how to update the slope from house i-1 to house i, and it requires only O(1) time.
    3. Since we know the revenue and the slope at location 0, and since we know how to update the slope iteratively, the complexity actually drops to O(N): between two consecutive houses/opposite-of-houses, we can just multiply the slope by the distance to obtain the difference in revenue.

    So the complete algorithm is:

    def algorithm(houses, L):
        def revenue(i):
            return sum(pop * min((i-j)%(4*L), 4*L - (i-j)%(4*L)) for j,pop in houses)
    
        slope_changes = sorted(
                [(x, 2*pop) for x,pop in houses] +
                [((x+2*L)%(4*L), -2*pop) for x,pop in houses])
    
        current_x = 0
        current_revenue = revenue(0)
        current_slope = current_revenue - revenue(4*L-1)
        best_revenue = current_revenue
    
        for x, slope_delta in slope_changes:
            current_revenue += (x-current_x) * current_slope
            current_slope += slope_delta
            current_x = x
            best_revenue = max(best_revenue, current_revenue)
    
        return best_revenue
    

    To keep things simple I used sorted() to merge the two types of slope changes, but this is not optimal as it has O(N log N) complexity. If you want better efficiency, you can generate in O(N) time a sorted list corresponding to the opposite-of-houses, and merge it with the list of houses in O(N) (e.g. with the standard library's heapq.merge). You could also stream from iterators instead of lists if you want to minimize memory usage.
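
    A minimal sketch of that linear-time merge, assuming houses is already sorted by x and all positions are distinct (the function name merged_slope_changes is hypothetical):

```python
import heapq

def merged_slope_changes(houses, L):
    """Build the sorted slope-change list without a full sort.

    houses: (x, pop) pairs sorted by x, with 0 <= x < 4*L.
    """
    half = 2 * L  # half the perimeter

    house_events = [(x, 2 * pop) for x, pop in houses]

    # Houses in the second half map to opposite points in [0, half),
    # houses in the first half map to opposite points in [half, 4*L).
    # Each group stays in ascending order, so concatenating them is sorted.
    wrapped = [(x - half, -2 * pop) for x, pop in houses if x >= half]
    shifted = [(x + half, -2 * pop) for x, pop in houses if x < half]
    opposite_events = wrapped + shifted

    # heapq.merge lazily merges already-sorted inputs.
    return list(heapq.merge(house_events, opposite_events))

print(merged_slope_changes([(1, 3), (3, 1), (10, 1), (11, 2)], 3))
```

    The result can be used in place of the sorted(...) expression in the algorithm above.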

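    TL;DR: with a linear-time merge in place of sorted(), and houses given in sorted order, this solution runs in O(N). That is the best possible, since every house has to be read at least once.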

  • 2021-01-31 11:39

    I'll provide some tips so that you can still have some challenge for yourself.


    Let me start with a heavily simplified version:

    There are N houses on a straight street, and each is either populated or empty.

    0 1 1 0 1
    

    Let's calculate the score for them, knowing that the n-th house has a score equal to the sum of all distances to the other houses which are non-empty. So the score of the first house is 1+2+4 = 7, since there are 3 other populated houses, at distances 1, 2 and 4.

    The full array of scores looks like:

    7 4 3 4 5
    

    How to calculate that? The obvious approach is...

    for every house i
        score(i) = 0
        for every other house j
            if j is populated, score(i) += distance(i, j)
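
    In Python, the pseudocode above might be written as follows (a sketch; the function name naive_scores is an assumption):

```python
def naive_scores(street):
    """street[i] is 1 if house i is populated, 0 if it is empty."""
    n = len(street)
    scores = []
    for i in range(n):
        score = 0
        for j in range(n):
            # Only other, populated houses contribute their distance.
            if j != i and street[j]:
                score += abs(i - j)
        scores.append(score)
    return scores

print(naive_scores([0, 1, 1, 0, 1]))  # [7, 4, 3, 4, 5]
```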
    

    This gives you O(N^2) complexity. However there's a quicker way that calculates all the scores in O(N), because it doesn't feature a nested loop. It's related to prefix sums. Can you find it?

  • 2021-01-31 11:41

    There is no need to calculate every house!!!

    It's not fully developed, but I think it is worth thinking about it:

    modulo N

    N is the number of all houses; n shall be the "address" (number) of one of the houses.

    If you walk around the island, you will find that n increases by 1 for each house you pass. If you reach the house where n is N, then the next house has the number 1.

    Let us use a different system of numbering: decrease every house number by 1. Then n goes from 0 to N-1. This is the same way numbers modulo N behave.

    Litres is a function of the house-number n (modulo N)

    You can calculate the number of litres for each house number by summing, over all houses, the product of the distance and the number of people living there.

    You can also draw a graph of that function: x is n, y is the number of litres.

    the function is periodic

    If you understand what modulo means, you will understand that the graph you just drew is just one period of a periodic function, since Litre(n) is equal to Litre(n + x * N), where x is an integer (which may also be negative).

    if N is big, the function is "Pseudo-continuous"

    What I mean is this: If N is really big, then the amount of litres will not change very much if you move from house a to its neighbour, house a+1. So you can use methods of interpolation.

    you are looking for the place of the "global" maximum of a periodic pseudo-continuous function (only really global within one period)

    This is my suggestion:

    Step 1: select a distance d that is bigger than 1 and smaller than N. I can't say why, but I would use d = int(sqrt(N)) (maybe there is a better choice, try it out).
    Step 2: calculate the litres for houses 0, d, 2d, 3d, ...
    Step 3: you will find some values that are higher than both of their neighbours. Take these high points and their neighbours and use a method of interpolation (interval splitting) to calculate more points close to each high point.

    Repeat this interpolation for the other high points as long as you have time (you have 1 second, which is a long time!)

    Jump from one high point to another if you see that the global maximum must be elsewhere.
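
    A rough sketch of steps 1 to 3, with one simplification: instead of interpolating, it scans every house within d of each sampled high point. The function names are hypothetical, and houses is assumed to be a list with the population at each house number. This is a heuristic and may miss the true maximum if a peak lies between samples that are not high points themselves.

```python
import math

def litres(houses, n):
    """Litres needed if the plant is at house n (circular distances)."""
    N = len(houses)
    return sum(people * min((i - n) % N, (n - i) % N)
               for i, people in enumerate(houses))

def coarse_then_refine(houses):
    N = len(houses)
    d = max(1, math.isqrt(N))              # step 1: sampling distance
    samples = list(range(0, N, d))         # step 2: houses 0, d, 2d, ...
    values = [litres(houses, n) for n in samples]

    # Start from the best sampled point.
    best_value, best_n = max(zip(values, samples))

    # Step 3: look more closely around every sampled high point.
    for k, n in enumerate(samples):
        left = values[k - 1]                     # wraps around for k == 0
        right = values[(k + 1) % len(samples)]
        if values[k] >= left and values[k] >= right:
            for m in range(n - d + 1, n + d):
                v = litres(houses, m % N)
                if v > best_value:
                    best_value, best_n = v, m % N
    return best_n, best_value

print(coarse_then_refine([0, 3, 0, 1, 0, 0, 0, 0, 0, 0, 1, 2]))  # (7, 33)
```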
