For an algorithm competition training (not homework) we were given this question from a past year. Posted it to this site because the other site required a login.
This i
Here is a less mathematically inclined solution that works in O(n).
Let us partition the houses (indexing starts at 0) into two disjoint sets:
- F, "front", where people walk CCW to the house
- B, "back", where people walk CW to the house
and a single house p that marks the current position where the plant would be built.
I have based my illustration on the example given in the image.
By convention, let's assign half the houses to F, and exactly one less to B.
- F contains 6 houses
- B contains 5 houses
With simple modular arithmetic, we can easily access the houses by (p + offset) % 12, thanks to Python's sane implementation of the modulo operator, quite unlike some other popular languages.
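As a small illustration of that modular access, here is a minimal sketch. It assumes the houses are stored as a plain list of populations indexed by position; the `populations` list is the example input used at the end of this post, and `house_at` is a helper name made up for this illustration.

```python
def house_at(populations, p, offset):
    """Population of the house `offset` steps away from position p.

    Python's % always returns a non-negative result for a positive
    modulus, so negative offsets wrap around without special handling.
    """
    return populations[(p + offset) % len(populations)]


populations = [0, 3, 0, 1, 0, 0, 0, 0, 0, 0, 1, 2]
one_step_back = house_at(populations, 0, -1)      # wraps to index 11
three_steps_ahead = house_at(populations, 10, 3)  # wraps to index 1
```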
If we arbitrarily choose a position for p, we can trivially determine the consumption of water in O(L).
We could do this all over again for every other position of p to arrive at a runtime of O(L^2).
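For reference, the quadratic approach could look roughly like this. It is only a sketch under the assumption that the input is a list of populations indexed by position; `consumption_at` and `brute_force` are names chosen for this illustration.

```python
def consumption_at(populations, p):
    """O(L): every house contributes population * shortest walking distance."""
    n = len(populations)
    total = 0
    for offset in range(n):
        distance = min(offset, n - offset)  # shorter way around the island
        total += populations[(p + offset) % n] * distance
    return total


def brute_force(populations):
    """O(L^2): evaluate every position and keep the first best one."""
    candidates = ((p, consumption_at(populations, p))
                  for p in range(len(populations)))
    return max(candidates, key=lambda pair: pair[1])
```

It is handy as a reference implementation to check the faster versions against.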
However, if we only shift p by one position, we can determine the new consumption in O(1) if we make a somewhat clever observation: the number of people living in F (or B, respectively) determines how much the consumption of F (or B) changes when we set p' = p+1 (plus some corrections, because F itself will change). I have tried to depict this here to the best of my abilities.
We end up with a total running time of O(L).
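A compact way to see the O(1) update is the sketch below. It is not the program from the end of this post, just a condensed variant under two assumptions: the input is a list of populations indexed by position, and the number of positions is even. The function name `best_plant_position` is made up for this illustration.

```python
def best_plant_position(populations):
    n = len(populations)
    if n % 2 != 0:
        raise ValueError("this sketch assumes an even number of positions")
    half = n // 2

    # Consumption for p = 0, computed directly in O(n).
    total = sum(pop * min(j, n - j) for j, pop in enumerate(populations))
    # Weight of F: the `half` houses ahead of p (offsets 1..half).
    front_weight = sum(populations[1:half + 1])
    all_weight = sum(populations)

    best = (0, total)
    for p in range(n - 1):
        # Moving p -> p + 1: houses in F get 1 closer, all others
        # (including the house at p itself) get 1 farther away.
        total += (all_weight - front_weight) - front_weight
        # Slide the window: the house at p + 1 leaves F,
        # the house at p + 1 + half enters it.
        front_weight += populations[(p + 1 + half) % n] - populations[p + 1]
        if total > best[1]:
            best = (p + 1, total)
    return best
```

Each loop iteration does a constant amount of work, which gives the O(L) total.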
The program for this algorithm is at the end of the post.
But we can do better. As long as no houses change between the sets, the c's and w's that are added will be zero. We can calculate how many of these steps there are and do them in one step.
Houses change sets when:
- p is on a house
- p is opposite of a house
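The two cases above determine where the algorithm has to stop. A minimal sketch of collecting those stops, assuming houses are given by their positions on a circle with an even circumference (`stop_positions` and its parameter names are made up for this illustration):

```python
def stop_positions(house_positions, circumference):
    """Sorted positions where a house changes between F and B."""
    half = circumference // 2  # assumes an even circumference
    stops = set()
    for pos in house_positions:
        stops.add(pos % circumference)           # p is on the house
        stops.add((pos + half) % circumference)  # p is opposite of the house
    return sorted(stops)


# Occupied positions of the example input, on an island with 12 positions.
example_stops = stop_positions([1, 3, 10, 11], 12)
```

With N houses there are at most 2N stops, no matter how large the island is.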
In the following diagram, I have visualized the stops the algorithm now makes to update the Cs and Ws.
Highlighted is the house that causes the algorithm to stop.
The algorithm begins at a house (or the opposite of one, we'll see why later), in this case that happens to be a house.
Again, we have both a consumption C(B) = 3*1 and C(F) = 2 * 1. If we shift p to the right by one, we add 4 to C(B) and subtract 1 from C(F). If we shift p once again, the exact same thing happens.
As long as the same two sets of houses move closer and further away respectively, the changes to the Cs are constant.
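Because the per-step change is constant between two stops, several steps can be collapsed into one multiplication. A tiny sketch using the numbers from the paragraph above (the function `advance` and its parameter names are made up for this illustration):

```python
def advance(c_back, c_front, w_back, w_front, steps):
    """Move p by `steps` positions while no house changes sets.

    Houses in B move away (their consumption grows by w_back per step),
    houses in F come closer (their consumption shrinks by w_front per step).
    """
    return (c_back + steps * w_back, c_front - steps * w_front)


# C(B) = 3, C(F) = 2, adding 4 and subtracting 1 per step.
after_one = advance(3, 2, 4, 1, steps=1)
after_two = advance(3, 2, 4, 1, steps=2)
```

Two single steps and one double step give the same result, which is exactly what lets the algorithm jump from stop to stop.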
We now change the definition of B slightly: it will now also contain p! (This does not change the above paragraphs regarding this optimized version of the algorithm.)
This is done because when we move to the next step, we will add the weight of the houses that are moving away repeatedly. The house at the current position is moving away when p moves to the right, thus W(B) is the correct summand.
The other case is when a house stops moving away and comes closer again. In that case the Cs change drastically because 6*weight goes from one C to the other. That is the other case when we need to stop.
I hope it is clear how and why this works, so I'll just leave the working algorithm here. Please ask if something is not clear.
O(n):
import itertools


def hippo_island(houses, L):
    return PlantBuilder(houses, L).solution


class PlantBuilder:
    def __init__(self, houses, L):
        self.L = L
        self.houses = sorted(houses)
        # Stops: every house itself, plus the point opposite of every house.
        # Integer division keeps positions integral (assumes an even L).
        self.changes = sorted(
            [((pos + L // 2) % L, -transfer) for pos, transfer in self.houses] +
            self.houses)
        self.starting_position = min(self.changes)[0]

        def is_front(pos_population):
            pos = pos_population[0]
            pos += L if pos < self.starting_position else 0
            return self.starting_position < pos <= self.starting_position + L // 2

        front_houses = filter(is_front, self.houses)
        # filterfalse is the Python 3 name of Python 2's ifilterfalse.
        back_houses = list(itertools.filterfalse(is_front, self.houses))
        self.front_count = len(houses) // 2
        self.back_count = len(houses) - self.front_count - 1
        (self.back_weight, self.back_consumption) = self._initialize_back(back_houses)
        (self.front_weight, self.front_consumption) = self._initialize_front(front_houses)
        self.solution = (0, self.back_weight + self.front_weight)
        self.run()

    def distance(self, i, j):
        return min((i - j) % self.L, self.L - (i - j) % self.L)

    def run(self):
        for (position, weight) in self.consumptions():
            self.update_solution(position, weight)

    def consumptions(self):
        last_position = self.starting_position
        for position, transfer in self.changes[1:]:
            distance = position - last_position
            # Houses in F come closer, houses in B move away.
            self.front_consumption -= distance * self.front_weight
            self.back_consumption += distance * self.back_weight
            self.back_weight += transfer
            self.front_weight -= transfer
            # We are opposite of a house, it will change from B to F
            if transfer < 0:
                # Its contribution of L/2 * weight moves from C(B) to C(F);
                # the yielded sum is unaffected.
                self.front_consumption -= self.L // 2 * transfer
                self.back_consumption += self.L // 2 * transfer
            last_position = position
            yield (position, self.back_consumption + self.front_consumption)

    def update_solution(self, position, weight):
        (best_position, best_weight) = self.solution
        if weight > best_weight:
            self.solution = (position, weight)

    def _initialize_front(self, front_houses):
        weight = 0
        consumption = 0
        for position, population in front_houses:
            distance = self.distance(self.starting_position, position)
            consumption += distance * population
            weight += population
        return (weight, consumption)

    def _initialize_back(self, back_houses):
        # The first back house sits at the starting position (distance 0).
        weight = back_houses[0][1]
        consumption = 0
        for position, population in back_houses[1:]:
            distance = self.distance(self.starting_position, position)
            consumption += distance * population
            weight += population
        return (weight, consumption)
O(L):
def hippo_island(houses):
    return PlantBuilder(houses).solution


class PlantBuilder:
    def __init__(self, houses):
        self.houses = houses
        self.front_count = len(houses) // 2
        self.back_count = len(houses) - self.front_count - 1
        (self.back_weight, self.back_consumption) = self.initialize_back()
        (self.front_weight, self.front_consumption) = self.initialize_front()
        self.solution = (0, self.back_weight + self.front_weight)
        self.run()

    def run(self):
        for (position, weight) in self.consumptions():
            self.update_solution(position, weight)

    def consumptions(self):
        for position in range(1, len(self.houses)):
            self.remove_current_position_from_front(position)
            self.add_house_furthest_from_back_to_front(position)
            self.remove_furthest_house_from_back(position)
            self.add_house_at_last_position_to_back(position)
            yield (position, self.back_consumption + self.front_consumption)

    def add_house_at_last_position_to_back(self, position):
        self.back_weight += self.houses[position - 1]
        self.back_consumption += self.back_weight

    def remove_furthest_house_from_back(self, position):
        house_position = position - self.back_count - 1
        distance = self.back_count
        self.back_weight -= self.houses[house_position]
        self.back_consumption -= distance * self.houses[house_position]

    def add_house_furthest_from_back_to_front(self, position):
        house_position = position - self.back_count - 1
        distance = self.front_count
        self.front_weight += self.houses[house_position]
        self.front_consumption += distance * self.houses[house_position]

    def remove_current_position_from_front(self, position):
        self.front_consumption -= self.front_weight
        self.front_weight -= self.houses[position]

    def update_solution(self, position, weight):
        (best_position, best_weight) = self.solution
        if weight > best_weight:
            self.solution = (position, weight)

    def initialize_front(self):
        weight = 0
        consumption = 0
        for distance in range(1, self.front_count + 1):
            consumption += distance * self.houses[distance]
            weight += self.houses[distance]
        return (weight, consumption)

    def initialize_back(self):
        weight = 0
        consumption = 0
        for distance in range(1, self.back_count + 1):
            consumption += distance * self.houses[-distance]
            weight += self.houses[-distance]
        return (weight, consumption)
Result:
>>> hippo_island([0, 3, 0, 1, 0, 0, 0, 0, 0, 0, 1, 2])
(7, 33)
Suppose the list houses is composed of pairs (x,pop) with 0 <= x < 4*L the location and pop the population.
The objective function, which we want to maximize, is
def revenue(i):
    return sum(pop * min((i-j)%(4*L), 4*L - (i-j)%(4*L)) for j,pop in houses)
The naive O(LN) algorithm is simply:
max_revenue = max(revenue(i) for i in range(4*L))
But it is incredibly wasteful to entirely re-evaluate revenue for each location.
To avoid that, notice that this is a piecewise-linear function; so its derivative is piecewise constant, with discontinuities at two kinds of points:
- at a house i, the derivative changes from slope to slope + 2*population[i]
- at the point opposite of a house i on the island, the derivative changes from slope to slope - 2*population[i]
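These slope jumps are easy to check numerically. The sketch below rebuilds the revenue function for a tiny made-up island (a single house of 5 people at x = 3, with L = 3, so the circumference is 12) and measures how much the slope changes at a point; `make_revenue` and `slope_jump` are helper names invented for this check.

```python
def make_revenue(houses, L):
    circumference = 4 * L

    def revenue(i):
        return sum(pop * min((i - j) % circumference,
                             circumference - (i - j) % circumference)
                   for j, pop in houses)
    return revenue


def slope_jump(revenue, x):
    """Slope just after x minus slope just before x."""
    return (revenue(x + 1) - revenue(x)) - (revenue(x) - revenue(x - 1))


revenue = make_revenue([(3, 5)], L=3)
jump_at_house = slope_jump(revenue, 3)     # at the house
jump_at_opposite = slope_jump(revenue, 9)  # opposite of the house
jump_elsewhere = slope_jump(revenue, 5)    # neither
```

The jump is +2*pop at the house, -2*pop at the opposite point, and zero everywhere else.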
This makes things very simple:
- Updating slope from house i-1 to house i requires only O(1) time.
- Since we only ever update slope iteratively, the complexity actually drops to O(N): between two consecutive houses/opposite-of-houses, we can just multiply the slope by the distance to obtain the difference in revenue.
So the complete algorithm is:
def algorithm(houses, L):
    def revenue(i):
        return sum(pop * min((i-j)%(4*L), 4*L - (i-j)%(4*L)) for j,pop in houses)

    slope_changes = sorted(
        [(x, 2*pop) for x,pop in houses] +
        [((x+2*L)%(4*L), -2*pop) for x,pop in houses])

    current_x = 0
    current_revenue = revenue(0)
    current_slope = current_revenue - revenue(4*L-1)
    best_revenue = current_revenue

    for x, slope_delta in slope_changes:
        current_revenue += (x-current_x) * current_slope
        current_slope += slope_delta
        current_x = x
        best_revenue = max(best_revenue, current_revenue)

    return best_revenue
To keep things simple I used sorted() to merge the two types of slope changes, but this is not optimal as it has O(N log N) complexity. If you want better efficiency, you can generate in O(N) time a sorted list corresponding to the opposite-of-houses, and merge it with the list of houses in O(N) (e.g. with the standard library's heapq.merge). You could also stream from iterators instead of lists if you want to minimize memory usage.
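One possible way to do that linear merge is sketched below. It assumes that houses is already sorted by x and that no two houses share a position; `merged_slope_changes` is a name made up for this illustration. The opposite points of a sorted list are the same list rotated, so they can be put in order without sorting.

```python
import heapq


def merged_slope_changes(houses, L):
    """houses: list of (x, pop), assumed sorted by x with distinct x values."""
    half, full = 2 * L, 4 * L
    at_houses = [(x, 2 * pop) for x, pop in houses]
    # Houses in the second half of the circle have their opposite point in
    # the first half, and vice versa; each part is already in sorted order.
    wrapped = [(x - half, -2 * pop) for x, pop in houses if x >= half]
    shifted = [(x + half, -2 * pop) for x, pop in houses if x < half]
    opposite = wrapped + shifted
    # heapq.merge combines already-sorted inputs lazily.
    return list(heapq.merge(at_houses, opposite))
```

The result can be used in place of the sorted(...) call, provided the input ordering assumption holds.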
TLDR: this solution achieves the lowest feasible complexity of O(N).
I'll provide some tips so that you can still have some challenge for yourself.
Let me start with a heavily simplified version:
There are N houses on a straight street, and each is either populated or empty.
0 1 1 0 1
Let's calculate the score for them, knowing that the n-th house has a score equal to the sum of all distances to other houses which are non-empty. So the score of the first house is 1+2+4 = 7, since there are 3 other populated houses and they are at distances 1, 2, 4.
The full array of scores looks like:
7 4 3 4 5
How to calculate that? The obvious approach is...
for every house i
    score(i) = 0
    for every other house j
        if j is populated, score(i) += distance(i, j)
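In runnable form, the nested loop above might look like this (a direct transcription, assuming the street is a list of 0/1 flags; `naive_scores` is a name made up here):

```python
def naive_scores(street):
    """Score of every house: sum of distances to all other populated houses."""
    scores = []
    for i in range(len(street)):
        score = 0
        for j, populated in enumerate(street):
            if j != i and populated:
                score += abs(i - j)
        scores.append(score)
    return scores


example_scores = naive_scores([0, 1, 1, 0, 1])
```

It reproduces the array of scores shown above and can serve as a reference when testing a faster version.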
This gives you O(N^2) complexity. However there's a quicker way that calculates all the scores in O(N), because it doesn't feature a nested loop. It's related to prefix sums. Can you find it?
There is no need to calculate every house!!!
It's not fully developed, but I think it is worth thinking about it:
modulo N
N is the number of all houses; n shall be the "address" (number) of one of the houses.
If you walk around the island, you will find that n increases by 1 for each house you pass. When you reach the house where n is N, the next house has the number 1.
Let us use a different system of numbering: decrease every house number by 1. Then n goes from 0 to N-1. This is the same way numbers modulo N behave.
Litres is a function of the house-number n (modulo N)
You can calculate the amount of litres for each house number by summing, over all houses, the product of the distance and the number of people living there.
You can also draw a graph of that function: x is n, y is the number of litres.
the function is periodic
If you understand what modulo means, you will understand that the graph you just drew is only one period of a periodic function, since Litre(n) is equal to Litre(n + x * N), where x is an integer (which may be negative too).
if N is big, the function is "Pseudo-continuous"
What I mean is this: If N is really big, then the amount of litres will not change very much if you move from house a to its neighbour, house a+1. So you can use methods of interpolation.
you are looking for the place of the "global" maximum of a periodic pseudo-continuous function (only really global within one period)
This is my suggestion:
Step 1: select a distance d that is bigger than 1 and smaller than N. I can't say why, but I would use d=int(sqrt(N)) (maybe there is a better choice, try it out).
Step 2: calculate the Litres for House 0, d, 2d, 3d, ...
Step 3: you will find some values that are higher than both of their neighbours. Feed these high points and their neighbours into a method of interpolation to calculate more points close to the high points (interval splitting).
Repeat this interpolation for other high points as long as you have time (you have 1 second, which is a long time!)
Jump from one high point to another if you see that the global maximum must be elsewhere.
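A rough sketch of steps 1 to 3 follows. It replaces the interpolation by simply evaluating every position within d of each high point, so it is a simplified stand-in rather than the method described above, and like any sampling heuristic it can miss the true maximum if the function is not smooth enough. The helper names (`make_litres`, `coarse_high_points`, `refine`) and the list-of-populations input are assumptions made for this illustration.

```python
import math


def make_litres(populations):
    n = len(populations)

    def litres(p):
        return sum(populations[(p + o) % n] * min(o, n - o) for o in range(n))
    return litres


def coarse_high_points(litres, n):
    """Steps 1 and 2: sample every d-th house, keep samples above both neighbours."""
    d = max(2, int(math.sqrt(n)))
    samples = list(range(0, n, d))
    values = [litres(s) for s in samples]
    high_points = []
    for k, s in enumerate(samples):
        left = values[k - 1]                    # wraps around (periodic)
        right = values[(k + 1) % len(samples)]
        if values[k] > left and values[k] > right:
            high_points.append(s)
    return d, high_points


def refine(litres, n, d, high_points):
    """Step 3 (simplified): look at every position within d of a high point."""
    if not high_points:
        candidates = range(n)  # fall back to checking everything
    else:
        candidates = sorted({(h + o) % n
                             for h in high_points for o in range(-d + 1, d)})
    return max(((p, litres(p)) for p in candidates), key=lambda pair: pair[1])
```

On the 12-house example from the earlier answer, the coarse pass singles out position 6 and the refinement then finds the maximum next to it.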