Ability to limit maximum reducers for a hadoop hive mapred job?


Question


I've tried prepending my query with:

set mapred.running.reduce.limit = 25;

And

 set hive.exec.reducers.max = 35;

The last one jailed a job with 530 reducers down to 35... which makes me think it was going to try to shoehorn 530 reducers' worth of work into 35.
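(For context, a minimal sketch of how these session-level settings usually fit together, assuming standard Hive/MRv1 property names of that era; hive.exec.reducers.bytes.per.reducer and mapred.reduce.tasks are not mentioned above and are shown only to illustrate where an estimate like 530 reducers comes from and how the cap is applied:)

-- sketch only: property names assume an older Hive on a Hadoop 1.x (MRv1) cluster
set hive.exec.reducers.bytes.per.reducer = 1000000000;  -- Hive estimates reducers roughly as input bytes / this value
set hive.exec.reducers.max = 35;                         -- hard cap applied to that estimate
-- set mapred.reduce.tasks = 35;                         -- alternative: force an exact reducer count
-- <query follows>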

Now giving

set mapred.tasktracker.reduce.tasks.maximum = 3;

a try to see if that number is some sort of max per node (it was previously 7 on a cluster with 70 potential reducers).

Update:

 set mapred.tasktracker.reduce.tasks.maximum = 3;

Had no effect, was worth a try though.
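That result is expected on an MRv1 cluster: mapred.tasktracker.reduce.tasks.maximum is a per-TaskTracker slot count that each node's daemon reads from its own mapred-site.xml at startup, so setting it from a Hive session has no effect on a running cluster. A sketch of where it would normally live (values are illustrative):

<!-- mapred-site.xml on each worker node; requires a TaskTracker restart to take effect -->
<property>
  <name>mapred.tasktracker.reduce.tasks.maximum</name>
  <value>3</value>  <!-- reduce slots per node, not per job -->
</property>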


Answer 1:


Not exactly a solution to the question, but potentially a good compromise.

set hive.exec.reducers.max = 45;

For a super query of doom that has 400+ reducers, this jails the most expensive Hive task down to 35 reducers total. My cluster currently has only 10 nodes, each node supporting 7 reducers... so in reality only 70 reducers can run at one time. By jailing the job down to fewer than 70, I've noticed a slight improvement in speed without any visible change to the final product. I'm still testing this in production to figure out what exactly is going on here; in the interim it's a good compromise solution.
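As a rough illustration of the compromise, assuming the 10-node, 7-slots-per-node cluster described above (the table and query names are hypothetical, and the single-wave explanation is my reading of why the speedup appears):

-- with 10 nodes x 7 reduce slots = 70 concurrent reducers available,
-- a cap under 70 likely lets the heaviest stage finish in one reduce wave
set hive.exec.reducers.max = 45;
insert overwrite table daily_rollup                  -- hypothetical target table
select dt, count(*) from huge_events group by dt;    -- hypothetical "query of doom"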



Source: https://stackoverflow.com/questions/4924674/ability-to-limit-maximum-reducers-for-a-hadoop-hive-mapred-job
