How to sort word count by value in Hadoop? [duplicate]

Submitted by 北慕城南 on 2019-11-30 01:50:33

Question


Hi, I want to learn how to sort the word count output by value in Hadoop. I know Hadoop takes care of sorting by key, but not by value.

I know that to sort by value we need a partitioner, a grouping comparator and a sort comparator,

but I am a bit confused about how to apply these concepts together to sort the word count by value.

Do we need another MapReduce job to achieve this, or can a combiner count the occurrences, sort them, and emit the result to the reducer?

Can anyone explain how to sort the word count example by value?


Answer 1:


You need a second MapReduce job. Until the total counts have been computed (which is what the first MR job does), there is nothing to sort by value (the word counts), so sorting in the same pass is not logically possible.
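To make the two-stage idea concrete, here is a minimal sketch in plain Java that only simulates the two jobs in memory; it does not use the Hadoop API. The class and method names (`SortWordCountByValue`, `countWords`, `sortByCount`) are hypothetical, and the descending order with ties broken alphabetically is an assumption, not something the answer specifies.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class SortWordCountByValue {

    // Stage 1 (stands in for the first job): total up the occurrences of each word.
    static Map<String, Integer> countWords(List<String> words) {
        Map<String, Integer> counts = new TreeMap<>();
        for (String word : words) {
            counts.merge(word, 1, Integer::sum);
        }
        return counts;
    }

    // Stage 2 (stands in for the second job): only now that the totals are final
    // can the (word, count) pairs be ordered by count.
    // Assumption: highest count first, ties broken alphabetically by word.
    static List<Map.Entry<String, Integer>> sortByCount(Map<String, Integer> counts) {
        List<Map.Entry<String, Integer>> entries = new ArrayList<>(counts.entrySet());
        entries.sort((a, b) -> {
            int byCount = Integer.compare(b.getValue(), a.getValue());
            return byCount != 0 ? byCount : a.getKey().compareTo(b.getKey());
        });
        return entries;
    }

    public static void main(String[] args) {
        List<String> words = Arrays.asList("a", "b", "a", "c", "b", "a");
        for (Map.Entry<String, Integer> e : sortByCount(countWords(words))) {
            System.out.println(e.getKey() + "\t" + e.getValue());
        }
    }
}
```

In a real cluster, the second stage would typically be a job whose mapper emits the count as the key and the word as the value, so that Hadoop's own key sorting does the ordering; a descending order would additionally need a custom sort comparator on the job.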




Answer 2:


This is called a secondary sort. See this and this for details.
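The links in this answer did not survive, so as a rough illustration of what a secondary sort generally means, here is a plain-Java sketch that does not use the Hadoop API. It assumes the usual pattern: a composite key (natural key plus value) is sorted on both fields, while grouping looks at the natural key only, so each group sees its values already ordered. All names (`SecondarySortSketch`, `CompositeKey`, `compareForSort`, `sortAndGroup`) are hypothetical.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class SecondarySortSketch {

    // Composite key: the natural key plus the value that should arrive in order.
    static final class CompositeKey {
        final String naturalKey;
        final int value;

        CompositeKey(String naturalKey, int value) {
            this.naturalKey = naturalKey;
            this.value = value;
        }
    }

    // Plays the role of the sort comparator: natural key ascending,
    // then value descending (the descending order is an assumption).
    static int compareForSort(CompositeKey a, CompositeKey b) {
        int byKey = a.naturalKey.compareTo(b.naturalKey);
        return byKey != 0 ? byKey : Integer.compare(b.value, a.value);
    }

    // Plays the role of the grouping comparator: records are grouped by
    // natural key only, so each group receives its values in sorted order.
    static Map<String, List<Integer>> sortAndGroup(List<CompositeKey> records) {
        List<CompositeKey> sorted = new ArrayList<>(records);
        sorted.sort(SecondarySortSketch::compareForSort);
        Map<String, List<Integer>> groups = new LinkedHashMap<>();
        for (CompositeKey key : sorted) {
            groups.computeIfAbsent(key.naturalKey, unused -> new ArrayList<>()).add(key.value);
        }
        return groups;
    }
}
```

In Hadoop itself, the partitioner would also have to route records by the natural key alone, so that all composite keys sharing it reach the same reducer.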



Source: https://stackoverflow.com/questions/18403857/how-to-sort-word-count-by-value-in-hadoop
