How to increase the number of mappers and reducers in Hadoop according to the number of instances used, to improve performance?

Submitted by 江枫思渺然 on 2019-11-29 08:58:11

Changing the number of mappers is a pure optimization and should not affect the results. Set the number so that it fully utilizes your cluster (if it is dedicated). Try a number of mappers per node equal to the number of cores. Watch CPU utilization and increase the number until you get almost full CPU utilization or your system starts swapping. You may need fewer mappers than cores if you do not have enough memory.
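The starting point described above (one mapper per core, fewer if memory is short) can be sketched as a small calculation. This is only a rough illustration, not a Hadoop API: the class name, the method name, and the node figures are hypothetical, and the result is a first guess to refine by watching CPU utilization and swapping.

```java
public class MapperSlotEstimate {
    /**
     * Starting guess for concurrent map tasks on one node: one per core,
     * capped by how many task JVMs fit in the memory set aside for tasks.
     * All three inputs are assumptions supplied by whoever tunes the cluster.
     */
    public static int initialMappersPerNode(int cores, long nodeTaskMemoryMb, long memoryPerMapperMb) {
        if (cores < 1 || memoryPerMapperMb < 1) {
            throw new IllegalArgumentException("cores and memoryPerMapperMb must be positive");
        }
        long fitInMemory = nodeTaskMemoryMb / memoryPerMapperMb;
        // Never go below one mapper, never above the core count or the memory limit.
        return (int) Math.max(1, Math.min(cores, fitInMemory));
    }

    public static void main(String[] args) {
        // Hypothetical node: 8 cores, 12 GB available to tasks, 2 GB per mapper.
        System.out.println(initialMappersPerNode(8, 12 * 1024, 2048)); // prints 6 (memory bound)
        // Same node with 1 GB mappers: the core count becomes the limit.
        System.out.println(initialMappersPerNode(8, 12 * 1024, 1024)); // prints 8 (core bound)
    }
}
```

From that starting value, raise or lower the number while measuring, as the answer suggests.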
The number of reducers affects the results, so if you need a specific number of reducers (such as 1), set it.
If you can handle the results of any number of reducers, do the same optimization as with mappers.
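One way to see why the reducer count changes the shape of the output: each reducer writes its own output file, and the default HashPartitioner picks the reducer for a key as (key.hashCode() & Integer.MAX_VALUE) % numReduceTasks. The sketch below reuses that formula on plain Java objects. The class name and sample keys are made up for illustration, and Hadoop's Text keys hash differently from java.lang.String, so the exact assignments are not what a real job would produce.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class ReducerPartitionSketch {
    /** Same formula as Hadoop's default HashPartitioner, applied to any Java object. */
    public static int partitionFor(Object key, int numReduceTasks) {
        return (key.hashCode() & Integer.MAX_VALUE) % numReduceTasks;
    }

    /** Groups keys by the reducer index (and so the output file) they would land in. */
    public static Map<Integer, List<String>> assign(String[] keys, int numReduceTasks) {
        Map<Integer, List<String>> byReducer = new TreeMap<>();
        for (String key : keys) {
            byReducer.computeIfAbsent(partitionFor(key, numReduceTasks), r -> new ArrayList<>()).add(key);
        }
        return byReducer;
    }

    public static void main(String[] args) {
        String[] keys = {"apple", "banana", "cherry", "date"}; // hypothetical keys
        // With one reducer every key lands in partition 0, so there is a single output file.
        System.out.println(assign(keys, 1));
        // With three reducers the same keys are spread over up to three output files.
        System.out.println(assign(keys, 3));
    }
}
```

If downstream code expects a single, globally sorted file, that is a case for pinning the reducer count to 1; otherwise the count is free to tune.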
In theory you can become I/O bound during this tuning process, so pay attention to that as well when tuning the number of tasks. You can recognize it by low CPU utilization despite an increase in the mapper/reducer count.

You can increase the number of mappers based on the block size and split size. One of the easiest ways is to decrease the split size, as shown below:

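import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.mapreduce.Job;

Configuration conf = new Configuration();
// Upper bound on split size in bytes; a smaller value yields more splits and therefore more mappers.
// Hadoop 2 names this property mapreduce.input.fileinputformat.split.maxsize.
conf.set("mapred.max.split.size", "1020");
Job job = Job.getInstance(conf, "My job name");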
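To get a feel for how the maximum split size translates into a mapper count, here is a rough sketch of the rule FileInputFormat uses to pick a split size, max(minSize, min(maxSize, blockSize)), followed by a simple ceiling division. The class and method names are illustrative, the 1 GiB file and 128 MiB block size are assumptions, and the estimate ignores the small slack Hadoop allows on the last split.

```java
public class SplitSizeSketch {
    /** Mirrors FileInputFormat's rule: max(minSize, min(maxSize, blockSize)). */
    public static long computeSplitSize(long blockSize, long minSize, long maxSize) {
        return Math.max(minSize, Math.min(maxSize, blockSize));
    }

    /** Rough split count for one splittable file (ceiling division). */
    public static long estimateSplits(long fileSize, long splitSize) {
        return (fileSize + splitSize - 1) / splitSize;
    }

    public static void main(String[] args) {
        long fileSize = 1024L * 1024 * 1024;  // assumed 1 GiB input file
        long blockSize = 128L * 1024 * 1024;  // assumed 128 MiB HDFS block size

        // No effective maximum: the split size falls back to the block size.
        long defaultSplit = computeSplitSize(blockSize, 1, Long.MAX_VALUE);
        System.out.println(estimateSplits(fileSize, defaultSplit)); // prints 8

        // Maximum of 16777216 bytes (16 MiB): eight times as many splits.
        long smallSplit = computeSplitSize(blockSize, 1, 16777216L);
        System.out.println(estimateSplits(fileSize, smallSplit)); // prints 64
    }
}
```

Very small maximums create a large number of short-lived map tasks, so the value is worth tuning against measured run time instead of simply minimizing it.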

I have tried the suggestion from @Animesh Raj Jha by modifying mapred.max.split.size and got a noticeable performance increase.

I am using Hadoop 2.2 and don't know how to set the maximum input split size. I would like to decrease this value in order to create more mappers. I tried updating yarn-site.xml, but it does not work.

Indeed, Hadoop 2.2 / YARN does not take any of the following settings into account:

<property>
<name>mapreduce.input.fileinputformat.split.minsize</name>
<value>1</value>
</property>
<property>
<name>mapreduce.input.fileinputformat.split.maxsize</name>
<value>16777216</value>
</property>

<property>
<name>mapred.min.split.size</name>
<value>1</value>
</property>
<property>
<name>mapred.max.split.size</name>
<value>16777216</value>
</property>

best
