I am trying to run a WordCount job in Hadoop, but I always get a ClassNotFoundException. I am posting the class that I wrote and the command I am using to run the job.
Although a MapReduce program processes data in parallel, the Mapper, Combiner and Reducer classes run as a sequence of phases: each phase depends on the output of the previous one, which is why you need job.waitForCompletion(true) to block until the job finishes.
However, the input and output paths and all of the job's classes must be configured before the job is submitted, not after. Reference
Change your code like this:
public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    Job job = new Job(conf, "WordCount");
    // Tell Hadoop which jar contains the job's classes
    job.setJarByClass(WordCount.class);
    job.setOutputKeyClass(Text.class);
    job.setOutputValueClass(IntWritable.class);
    job.setMapperClass(Map.class);
    job.setReducerClass(Reduce.class);
    job.setInputFormatClass(TextInputFormat.class);
    job.setOutputFormatClass(TextOutputFormat.class);
    FileInputFormat.addInputPath(job, new Path(args[0]));
    FileOutputFormat.setOutputPath(job, new Path(args[1]));
    // Submit the job and block until it completes -- this must come last,
    // after all configuration calls
    job.waitForCompletion(true);
}
I hope this works.
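For completeness, here is a typical compile-package-run sequence for a driver like the one above. This is a sketch: the file names, class names and HDFS paths are assumptions, so adjust them to your setup.

```shell
# Compile against the Hadoop client libraries
# (`hadoop classpath` prints the classpath the cluster uses)
javac -classpath "$(hadoop classpath)" -d classes WordCount.java

# Package the compiled classes into a jar
jar cf wordcount.jar -C classes .

# Run the job; the two trailing arguments become args[0] and args[1]
# (the input and output paths) in main()
hadoop jar wordcount.jar WordCount /user/hduser/input /user/hduser/output
```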
Try job.setJar("wordcount.jar"); where wordcount.jar is the jar file that you are going to package the classes into.
This method works for me, but setJarByClass does NOT!
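Whichever of the two methods you use, it is worth verifying that the jar you point Hadoop at actually contains your classes. A quick check (class names here are assumptions based on the WordCount example in the question):

```shell
# List the jar contents and confirm the driver, mapper and reducer
# classes were actually packaged
jar tf wordcount.jar | grep -E 'WordCount|Map|Reduce'
```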
I suspect this:
14/01/27 13:16:02 WARN mapred.JobClient: No job jar file set. User classes may not be found. See JobConf(Class) or JobConf#setJar(String).
I got the same error when using CDH 4.6, and it went away after resolving the above warning.
Use the code below to resolve this problem: job.setJarByClass(DriverClass.class);
Try adding this:
Job job = new Job(conf, "wordcount");
job.setJarByClass(WordCount.class);
I also got the same issue and fixed it by removing the stray WordCount.class file from the directory where I was executing my jar. It looks like Hadoop was picking up the class from outside the jar.
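To illustrate the fix above: clearing leftover .class files from the launch directory before re-running the job prevents them from shadowing the copies inside the jar. The file names below are assumptions for a typical WordCount build; adapt them to whatever classes your driver defines.

```shell
# Remove stray compiled classes from the launch directory so Hadoop
# resolves the classes from inside the jar instead
rm -f WordCount.class 'WordCount$Map.class' 'WordCount$Reduce.class'

# Re-run the job from the now-clean directory
hadoop jar wordcount.jar WordCount /user/hduser/input /user/hduser/output
```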