Computing the median in MapReduce
Question:

Can someone explain how the median/quantiles are computed in MapReduce?

My understanding of DataFu's median is that the n mappers sort the data and send it to a single reducer, which is responsible for sorting all the data from the n mappers and finding the median (the middle value). Is my understanding correct? If so, does this approach scale to massive amounts of data? I can clearly see the single reducer struggling to do the final task.

Thanks

Answer 1:

Trying to find the median (middle number)