How to sumit a mapreduce job to remote cluster configured with yarn?

问题

I am trying to execute a simple mapreduce program from eclipse .Following is my program

package wordcount;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class WordCount {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        conf.set("fs.defaultFS", "hdfs://quickstart.cloudera:8020");
        conf.set("mapreduce.framework.name", "yarn");
        conf.set("yarn.resourcemanager.address", "quickstart.cloudera:8032");
        conf.set("yarn.app.mapreduce.am.staging-dir", "/user");
        Job job = Job.getInstance(conf);
        job.setJarByClass(WordCount.class);
        job.addFileToClassPath(new Path("/user/cloudera/prasad/jars/hadoop-mapreduce-client-app-2.6.0-cdh5.7.0.jar"));
        job.addFileToClassPath(new Path("/user/cloudera/prasad/jars/hadoop-yarn-common-2.6.0-cdh5.7.0.jar"));
        job.addFileToClassPath(new Path("/user/cloudera/prasad/jars/hadoop-common-2.6.0-cdh5.7.0.jar"));
        job.addFileToClassPath(new Path("/user/cloudera/prasad/jars/hadoop-yarn-api-2.6.0-cdh5.7.0.jar"));
        job.addFileToClassPath(new Path("/user/cloudera/prasad/jars/hadoop-mapreduce-client-core-2.6.0-cdh5.7.0.jar"));
        job.addFileToClassPath(new Path("/user/cloudera/prasad/jars/hadoop-mapreduce-client-common-2.6.0-cdh5.7.0.jar"));
        job.addFileToClassPath(new Path("/user/cloudera/prasad/jars/commons-logging-1.2.jar"));
        job.addFileToClassPath(new Path("/user/cloudera/prasad/jars/guava-15.0.jar"));
        job.addFileToClassPath(new Path("/user/cloudera/prasad/jars/commons-collections-3.2.2.jar"));
        job.addFileToClassPath(new Path("/user/cloudera/prasad/jars/protobuf-java-2.5.0.jar"));
        job.addFileToClassPath(new Path("/user/cloudera/prasad/jars/commons-configuration-1.7.jar"));
        job.addFileToClassPath(new Path("/user/cloudera/prasad/jars/commons-lang-2.6.jar"));
        job.addFileToClassPath(new Path("/user/cloudera/prasad/jars/log4j-1.2.16.jar"));
        job.addFileToClassPath(new Path("/user/cloudera/prasad/jars/slf4j-api-1.7.5.jar"));
        job.addFileToClassPath(new Path("/user/cloudera/prasad/jars/slf4j-log4j12-1.7.5.jar"));
        job.setMapperClass(WordCountMapper.class);
        job.setReducerClass(WordCountReducer.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);
        FileInputFormat.addInputPath(job, new Path("/user/cloudera/prasad/test.txt"));
        FileOutputFormat.setOutputPath(job, new Path("/user/cloudera/prasad/wordout2"));
        job.waitForCompletion(true);
    }
}

Initially when I run the above program, it was throwing some ClassNotFoundExceptions in container log, so I have added all the corresponding jars as written in the program.Now its not showing any errors in container log but it is the mapreduce job is getting failed.

But resource manager showing the below error

Exception from container-launch with container ID: container_1473338609943_0003_01_000001 and exit code: 1
ExitCodeException exitCode=1: 
    at org.apache.hadoop.util.Shell.runCommand(Shell.java:561)
    at org.apache.hadoop.util.Shell.run(Shell.java:478)
    at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:738)
    at org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.launchContainer(DefaultContainerExecutor.java:211)
    at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:302)
    at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:82)
    at java.util.concurrent.FutureTask.run(FutureTask.java:262)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:745)

when I click on application log it is not showing anything just displaying the following message.

 Log Type: stderr

Log Upload Time: Thu Sep 08 05:26:35 -0700 2016

Log Length: 243

log4j:WARN No appenders could be found for logger (org.apache.hadoop.metrics2.impl.MetricsSystemImpl).
log4j:WARN Please initialize the log4j system properly.
log4j:WARN See http://logging.apache.org/log4j/1.2/faq.html#noconfig for more info.


Log Type: stdout

Log Upload Time: Thu Sep 08 05:26:35 -0700 2016

Log Length: 0

Please let me know what is wrong with my program.

Following are screens of libs I am using

来源：https://stackoverflow.com/questions/39396103/how-to-sumit-a-mapreduce-job-to-remote-cluster-configured-with-yarn

标签

java

Hadoop

MapReduce

yarn

hadoop2