Question
Problem: The median of M numbers is defined as
1) if M is odd, the middle number after sorting them in order;
2) if M is even, the average of the middle two numbers (again after sorting).
You start with an empty list of numbers. You can then add numbers to the list or remove them from it. After each add or remove operation, output the median of the numbers in the list.
Example: For a set of m = 5 numbers, {9, 2, 8, 4, 1}, the median is the third number in the sorted set {1, 2, 4, 8, 9}, which is 4. Similarly, for a set of m = 4 numbers, {5, 2, 10, 4}, the median is the average of the second and third elements in the sorted set {2, 4, 5, 10}, which is (4+5)/2 = 4.5.
My approach: I think the problem can be solved as follows. The idea is to use the previous median value and a pointer to find the new median, instead of recalculating it at every add or remove operation.
1) Use a multiset, which always keeps its elements in order and allows duplicates. In other words, maintain a sorted list somehow.
2) If the operation is add:
   2.1) Insert the element into the set, then calculate the median.
   2.2) If the size of the set is 1, the first element is the median.
   2.3) If the size of the set is even:
        if the new element is larger than the previous median, the new median is the average of the previous median and the next element in the set;
        else the new median is the average of the previous median and the element before it in the set.
   2.4) If the size is odd:
        if the new element is larger than the previous median:
            if it is also less than the second element of the previous median (the second of the two elements averaged to get the previous median), the new element is the new median;
            else the new median is that second element.
        else the new median is the element before the previous median in the set.
3) If the operation is remove:
   3.1) First calculate the new median.
   3.2) If the size of the set is 0, nothing can be removed.
   3.3) If the size is 1: if the first element is the element to be removed, remove it; otherwise nothing can be removed.
   3.4) If the size of the set is even:
        if the element to be deleted is greater than or equal to the second element of the previous median, the first element of the previous median is the new median;
        else the second element of the previous median is the new median.
   3.5) If the size of the set is odd:
        if the element to be deleted is the previous median, the new median is the average of the elements before and after it;
        else if the element to be deleted is greater than the previous median, the new median is the average of the previous median and the element before it;
        else the new median is the average of the previous median and the element after it.
   3.6) Remove the element.
Here is the working code: http://justprogrammng.blogspot.com/2012/06/interviewstreet-median-challenge.html
What are your views on this approach?
Answer 1:
Your approach seems like it could work, but from the description and the code, you can tell that there is a lot of casework involved. I wouldn't like to be the one having to debug that! So let me give you an alternative solution that should involve fewer cases, and therefore be much simpler to get right.
Keep two multisets (this algorithm also works with two priority queues, as we're only going to look at the extremes of each one). The first, minset, is going to keep the smallest n/2 numbers, and the second, maxset, is going to store the largest n/2 numbers.
Whenever you add a number:
- If it is greater than max(minset), add it to maxset.
- Otherwise, add it to minset.
Note that this doesn't guarantee the n/2 condition. Therefore, we should add one extra "fixing" step:
- If maxset.size() > minset.size(), remove the smallest element from maxset and insert it into minset.
- If minset.size() > maxset.size() + 1, remove the biggest element from minset and insert it into maxset.
After this is done, we just have to get the median. This should be really easy to do with our data structure: if the current n is odd, it's max(minset); if n is even, it's the average of max(minset) and min(maxset).
For the removal operation, just try to remove the number from whichever set contains it and do the fixing afterwards.
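The steps above can be sketched in Python roughly as follows. This is only an illustration under some assumptions: Python has no built-in multiset, so two sorted lists maintained with bisect stand in for minset and maxset (which makes each insert and removal O(n) rather than O(log n)), and the class name RunningMedian and its method names are made up for this sketch.

```python
from bisect import bisect_left, insort


class RunningMedian:
    """Two sorted halves: minset holds the smaller half, maxset the larger."""

    def __init__(self):
        self.minset = []  # sorted; its largest element is minset[-1]
        self.maxset = []  # sorted; its smallest element is maxset[0]

    def _fix(self):
        # Restore len(maxset) <= len(minset) <= len(maxset) + 1.
        if len(self.maxset) > len(self.minset):
            insort(self.minset, self.maxset.pop(0))
        elif len(self.minset) > len(self.maxset) + 1:
            insort(self.maxset, self.minset.pop())

    def add(self, x):
        if self.minset and x > self.minset[-1]:
            insort(self.maxset, x)
        else:
            insort(self.minset, x)
        self._fix()

    def remove(self, x):
        """Remove one occurrence of x; return False if x is not present."""
        for half in (self.minset, self.maxset):
            i = bisect_left(half, x)
            if i < len(half) and half[i] == x:
                half.pop(i)
                self._fix()
                return True
        return False

    def median(self):
        if not self.minset:
            return None  # empty structure has no median
        if len(self.minset) > len(self.maxset):
            return self.minset[-1]  # odd count
        return (self.minset[-1] + self.maxset[0]) / 2  # even count


rm = RunningMedian()
for x in [9, 2, 8, 4, 1]:
    rm.add(x)
print(rm.median())  # 4
```

A single fixing pass is enough here because one add or remove can push the sizes off balance by at most one element.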
Answer 2:
The main issue with your code is the comparison of each new item with the running median, which might be a calculated average value. Instead, you should compare the new item with the value at the previous middle (*prev in your code). As it is, after receiving the sequence 1 and 5, your median value will be 3. If the next value is 2 or 4, it should become the new median, but since your code follows a different path for each of those, one of the results is wrong.
It would be simpler overall to just keep track of the middle location and not the running median. Instead, calculate the median at the end of each add/remove operation:
    if size == 0
        median = NaN
    else if size is odd
        median = *prev
    else
        median = (*prev + *(prev-1)) / 2
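To make the 1, 5 example concrete, here is a small Python sketch of the same rule. It assumes a plain sorted list in place of the multiset and an index prev = size // 2 in place of the iterator; the function name median_from_middle is made up for illustration.

```python
from bisect import insort


def median_from_middle(sorted_values):
    """Derive the median from the middle position of a sorted list."""
    size = len(sorted_values)
    if size == 0:
        return float("nan")
    prev = size // 2  # plays the role of the middle iterator
    if size % 2 == 1:
        return sorted_values[prev]
    return (sorted_values[prev] + sorted_values[prev - 1]) / 2


# After 1 and 5, adding either 2 or 4 makes the new item the median.
for new_item in (2, 4):
    values = [1, 5]
    insort(values, new_item)
    print(new_item, median_from_middle(values))
```

Both iterations go through the same code path, which is the point: no branch depends on the averaged value 3.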
Answer 3:
I think you should verify two cases:
1) Negative numbers:
4
a -1
a 0
a 0
r 0
2) Two big integers whose sum exceeds the maximum int value.
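One way to check both cases is to replay the operations through a deliberately simple reference and compare its output with your program's. The sketch below is such a reference in Python, under a few assumptions: the "Wrong!" message and the "whole number or .5" output style are taken from the other answers in this thread rather than from the problem statement quoted above, 2147483647 is used as the maximum value assuming a 32-bit signed int, and the function names are made up. It uses integer arithmetic only, so the second case cannot overflow or lose precision.

```python
from bisect import bisect_left, insort


def format_median(sorted_values):
    """Format the median of a non-empty sorted list without using floats."""
    n = len(sorted_values)
    if n % 2 == 1:
        return str(sorted_values[n // 2])
    total = sorted_values[n // 2 - 1] + sorted_values[n // 2]
    if total % 2 == 0:
        return str(total // 2)
    # Odd sum: emit the sign separately so that -1 becomes "-0.5".
    sign = "-" if total < 0 else ""
    return f"{sign}{abs(total) // 2}.5"


def replay(operations):
    """Apply ('a' or 'r', value) pairs; return one output line per operation."""
    values, output = [], []
    for op, value in operations:
        if op == "a":
            insort(values, value)
        else:
            i = bisect_left(values, value)
            if i == len(values) or values[i] != value:
                output.append("Wrong!")
                continue
            values.pop(i)
        output.append(format_median(values) if values else "Wrong!")
    return output


print(replay([("a", -1), ("a", 0), ("a", 0), ("r", 0)]))
# ['-1', '-0.5', '0', '-0.5']
print(replay([("a", 2147483647), ("a", 2147483647)]))
# ['2147483647', '2147483647']
```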
Answer 4:
If your list is sorted, then you can calculate the median in constant time with a method similar to the following pseudo-code:
    if list.length % 2 == 0
        median = (list[list.length/2 - 1] + list[list.length/2]) / 2
    else
        median = list[list.length/2]
Therefore, just maintain a sorted list on every insert/remove. You can do these operations in O(n) time by stepping through the list until you are between an element that is < the added element and one that is >= the added element. You can find that position in O(log n) time instead if you start in the middle of the list and decide whether your element is less than or greater than the middle element, then take that half-list, start in the middle of it, and repeat. Note that with an array-backed list, the insert or remove itself still has to shift elements, which takes O(n) in the worst case.
Your problem doesn't state what the performance requirements are, but as far as I am aware the entire thing cannot always happen in constant time. With an array-backed list, this implementation has the following performance:
Find position O(log n)
Insert O(n)
Remove O(n)
Median O(1)
Answer 5:
This code solves the median challenge on interviewStreet.
#!/bin/python
# This code solves the median challenge on InterviewStreet.
# The logic is simple: insert each number into a sorted sequence in place.
# Bisection finds the insert index in O(log n). A count of the elements in
# the list is kept so the median can be printed in O(1).
from bisect import bisect_left

List = []
nnode = 0


def printMed():
    if nnode > 0:
        if nnode % 2 == 0:
            mid = 0.5 * (List[nnode // 2] + List[nnode // 2 - 1])
            # Print whole-number medians without a trailing ".0".
            if mid.is_integer():
                print(int(mid))
            else:
                print(mid)
        else:
            print(List[nnode // 2])
    else:
        print("Wrong!")


def rem(val):
    global nnode
    try:
        List.remove(val)
    except ValueError:
        # list.remove raises ValueError when val is not in the list.
        print("Wrong!")
    else:
        nnode = nnode - 1
        printMed()


if __name__ == "__main__":
    n = int(input())
    for i in range(n):
        l = input().split()
        if l[0] == 'r':
            rem(int(l[1]))
        else:
            index = bisect_left(List, int(l[1]))
            List.insert(index, int(l[1]))
            nnode = nnode + 1
            printMed()
Answer 6:
This is a solution to the median challenge in Java using Collections.sort(list):
import java.util.*;

public class SolutionMedian {
    ArrayList<Integer> sortedList = new ArrayList<Integer>();

    public static void main(String args[]) {
        SolutionMedian m = new SolutionMedian();
        Scanner in = new Scanner(System.in);
        int n = in.nextInt();
        char[] op = new char[n];
        int[] val = new int[n];
        // Read all operations first, then replay them in order.
        for (int i = 0; i < n; i++) {
            op[i] = in.next().charAt(0);
            val[i] = in.nextInt();
        }
        for (int i = 0; i < n; i++)
            if (op[i] == 'a')
                m.add(val[i]);
            else
                m.remove(val[i]);
    }

    void add(int val) {
        sortedList.add(val);
        getMedian();
    }

    void remove(int val) {
        int index = sortedList.indexOf(val);
        if (index >= 0) {
            // index is an int, so this calls remove(int index), not remove(Object).
            sortedList.remove(index);
            getMedian();
        } else {
            System.out.println("Wrong!");
        }
    }

    void getMedian() {
        Collections.sort(sortedList);
        int size = sortedList.size();
        switch (size) {
        case 0:
            System.out.println("Wrong!");
            break;
        case 1:
            System.out.println(sortedList.get(0));
            break;
        default:
            if (size % 2 == 0) { // even size
                int halfIndex = size / 2;
                // Widen to long before adding so the sum cannot overflow int.
                long sum = sortedList.get(halfIndex).longValue()
                        + sortedList.get(halfIndex - 1);
                if (1 == (sum & 1)) {
                    // Print the sign separately so that a sum of -1 gives "-0.5", not "0.5".
                    String sign = sum < 0 ? "-" : "";
                    System.out.println(sign + (Math.abs(sum) / 2) + ".5");
                } else
                    System.out.println(sum / 2);
            } else { // odd size
                System.out.println(sortedList.get((size - 1) / 2));
            }
        }
    }
}
Source: https://stackoverflow.com/questions/11007406/interviewstreet-median-challenge