Number of reducers in hadoop

没有蜡笔的小新 2021-02-20 10:35

While learning Hadoop, I found the number of reducers very confusing:

1) The number of reducers is the same as the number of partitions.

2) The number of reducers is 0.95 or 1.75 m

4 answers

醉酒成梦 (original poster)

2021-02-20 11:06

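If you don't set it explicitly, the number of reducers depends on the framework. A plain MapReduce job uses the value of mapreduce.job.reduces, which defaults to 1; Hive instead estimates the number from the size of the data being processed. In a MapReduce driver program you set it explicitly with: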

job.setNumReduceTasks(x)

By default, Hive has historically used one reducer per 1 GB of input data (releases from Hive 0.14 onward default to 256 MB).

So if you process less than 1 GB of data and don't set the number of reducers yourself, one reducer is used.

Similarly, if your data is 10 GB, 10 reducers are used.

You can also change this threshold, making the amount of data per reducer larger or smaller.

The Hive property that sets the amount of data per reducer is:

hive.exec.reducers.bytes.per.reducer

You can view its current value by running the set command in the Hive CLI.
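
The sizing rule described above can be sketched as a small calculation. This is only an illustration under stated assumptions, not Hive's actual source: the class and method names are hypothetical, the 1 GB figure is the one quoted in this answer, and the cap of 999 is an assumed value standing in for an upper bound such as Hive's hive.exec.reducers.max setting.

```java
// Simplified sketch of a size-based reducer estimate; not Hive's actual source.
final class ReducerEstimate {

    // Hypothetical helper: one reducer per bytesPerReducer of input, rounded up,
    // never fewer than 1 and never more than maxReducers.
    static int estimateReducers(long inputBytes, long bytesPerReducer, int maxReducers) {
        if (bytesPerReducer <= 0 || maxReducers <= 0) {
            throw new IllegalArgumentException("bytesPerReducer and maxReducers must be positive");
        }
        long reducers = (inputBytes + bytesPerReducer - 1) / bytesPerReducer; // ceiling division
        reducers = Math.max(1, reducers);
        return (int) Math.min(maxReducers, reducers);
    }

    public static void main(String[] args) {
        long oneGb = 1_000_000_000L; // the 1 GB per reducer figure quoted in the answer
        int cap = 999;               // assumed upper bound, for illustration only
        System.out.println(estimateReducers(500_000_000L, oneGb, cap));    // prints 1
        System.out.println(estimateReducers(10_000_000_000L, oneGb, cap)); // prints 10
    }
}
```

Lowering bytesPerReducer raises the estimate for the same input, which is the effect of setting a smaller value for the property.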

The partitioner only decides which reducer each record goes to; it does not determine how many reducers there are.
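
To make the partitioner's role concrete, here is a minimal standalone sketch of hash partitioning. It uses the same arithmetic as Hadoop's default HashPartitioner (the key's hash code, made non-negative, modulo the number of reduce tasks), but the class and method names are hypothetical and it does not depend on the Hadoop libraries. Note that the reducer count is an input to the function, not something it decides.

```java
import java.util.List;

// Standalone illustration of hash partitioning; names are hypothetical.
final class HashPartitionSketch {

    // Maps a key to a reducer index in the range [0, numReduceTasks).
    static int partitionFor(Object key, int numReduceTasks) {
        if (numReduceTasks <= 0) {
            throw new IllegalArgumentException("numReduceTasks must be positive");
        }
        // Masking with Integer.MAX_VALUE clears the sign bit so the result is never negative.
        return (key.hashCode() & Integer.MAX_VALUE) % numReduceTasks;
    }

    public static void main(String[] args) {
        int numReduceTasks = 3; // chosen by the job configuration, not by the partitioner
        for (String key : List.of("apple", "banana", "cherry", "apple")) {
            // Equal keys always land on the same reducer.
            System.out.println(key + " -> reducer " + partitionFor(key, numReduceTasks));
        }
    }
}
```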
