Number of reducers in hadoop

后端 未结 4 1287
没有蜡笔的小新
没有蜡笔的小新 2021-02-20 10:35

I was learning hadoop, I found number of reducers very confusing :

1) Number of reducers is same as number of partitions.

2) Number of reducers is 0.95 or 1.75 m

4条回答
  •  暗喜
    暗喜 (楼主)
    2021-02-20 10:55

    1 - The number of reducers is as number of partitions - False. A single reducer might work on one or more partitions. But a chosen partition will be fully done on the reducer it is started.

    2 - That is just a theoretical number of maximum reducers you can configure for a Hadoop cluster. Which is very much dependent on the kind of data you are processing too (decides how much heavy lifting the reducers are burdened with).

    3 - The mapred-site.xml configuration is just a suggestion to the Yarn. But internally the ResourceManager has its own algorithm running, optimizing things on the go. So that value is not really the number of reducer tasks running every time.

    4 - This one seems a bit unrealistic. My block size might 128MB and everytime I can't have 128*5 minimum number of reducers. That's again is false, I believe.

    There is no fixed number of reducers task that can be configured or calculated. It depends on the moment how much of the resources are actually available to allocate.

提交回复
热议问题