I am trying to run a MapReduce program on Hadoop.
When I submit my job to a single-node Hadoop cluster, the job is created but fails with the message
Include the properties below in yarn-site.xml and restart the VM:
<property>
<name>yarn.nodemanager.vmem-check-enabled</name>
<value>false</value>
<description>Whether virtual memory limits will be enforced for containers</description>
</property>
<property>
<name>yarn.nodemanager.vmem-pmem-ratio</name>
<value>4</value>
<description>Ratio between virtual memory to physical memory when setting memory limits for containers</description>
</property>
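To see what the second property changes, here is a rough sketch of the arithmetic, assuming the virtual memory ceiling is simply the container's physical allocation multiplied by the ratio. The class name, method name and the 1024 MB container size are made up for illustration; 2.1 is the stock default for yarn.nodemanager.vmem-pmem-ratio.

```java
// Sketch only: approximates how the virtual memory ceiling scales with the ratio.
public class VmemLimit {

    /** Virtual memory ceiling in MB for a container with the given physical allocation. */
    public static long virtualLimitMb(long physicalMb, double vmemPmemRatio) {
        return (long) (physicalMb * vmemPmemRatio);
    }

    public static void main(String[] args) {
        long containerMb = 1024; // assumed container size, for illustration only

        // Default ratio versus the value suggested above
        System.out.println("ratio 2.1 -> " + virtualLimitMb(containerMb, 2.1) + " MB");
        System.out.println("ratio 4   -> " + virtualLimitMb(containerMb, 4.0) + " MB");
    }
}
```

With the check disabled (the first property) the ratio no longer matters, so in practice you would normally pick one of the two settings rather than rely on both.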
A container is a YARN JVM process. In MapReduce, the application master service and the mapper and reducer tasks are all containers that execute inside the YARN framework.
You can fix this issue either by increasing the number of reducers (say, mapreduce.job.reduces=10) or by increasing the reduce heap size (mapreduce.reduce.java.opts=-Xmx2014m).
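One thing to watch when raising the reduce heap: the heap has to fit inside the reducer's container, whose size is set by mapreduce.reduce.memory.mb. The sketch below estimates a container size for a given -Xmx, assuming a rule of thumb that the heap takes about 80% of the container. That 0.8 figure and the class and method names are assumptions for illustration, not something taken from your cluster.

```java
// Sketch only: sizes a reducer container so that the JVM heap fits inside it.
public class ReducerMemory {

    /** Smallest container size in MB such that heapMb is at most heapRatio of it. */
    public static long minContainerMb(long heapMb, double heapRatio) {
        return (long) Math.ceil(heapMb / heapRatio);
    }

    public static void main(String[] args) {
        long heapMb = 2014;     // heap size from -Xmx2014m above
        double heapRatio = 0.8; // assumed share of the container used by the heap

        long containerMb = minContainerMb(heapMb, heapRatio);

        // Print the pair of settings that would go together
        System.out.println("mapreduce.reduce.java.opts=-Xmx" + heapMb + "m");
        System.out.println("mapreduce.reduce.memory.mb=" + containerMb);
    }
}
```

If the container is left smaller than the heap you ask for, YARN is likely to kill the reducer for exceeding its memory limit, so the two values should be raised together.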
If you want a fixed number of reducers at run time, you can set it when submitting the MapReduce job on the command line. Passing -D mapreduce.job.reduces=10 (with the desired number) will spawn that many reducers at runtime.
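In the code, you can set the number of reducers on the Job object (with the older mapred API, the equivalent is a JobConf variable). Let's say we have a Job variable named job.
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.mapreduce.Job;

Configuration conf = new Configuration();
Job job = Job.getInstance(conf); // new Job(conf) is deprecated
job.setNumReduceTasks(10); // 10 reducers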
You can also split the input file into smaller files for this particular job to avoid the memory issue.
If you are still getting the issue, please check the YARN logs and post them.