Question
I recently installed the new Hadoop 2.2. I had previously written a simple WordCount MapReduce program that ran without trouble on CDH4, but now all of my org.apache.hadoop.mapreduce imports fail to resolve. Can someone tell me exactly which JAR to add to the build path to fix these imports? The code is below, in case someone needs to point out changes required to make it run on Hadoop 2.2.
import java.io.IOException;
import java.lang.InterruptedException;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class MapRWordCount {

    private final static IntWritable ONE = new IntWritable(1);
    private final static Pattern WORD = Pattern.compile("\\w+");

    public static class WordCountMapper
            extends Mapper<LongWritable, Text, Text, IntWritable> {

        private final Text word = new Text();

        @Override
        public void map(LongWritable key, Text value, Context context)
                throws IOException, InterruptedException {
            String valueString = value.toString();
            Matcher matcher = WORD.matcher(valueString);
            while (matcher.find()) {
                word.set(matcher.group().toLowerCase());
                context.write(word, ONE);
            }
        }
    }

    public static class WordCountReducer
            extends Reducer<Text, IntWritable, Text, IntWritable> {

        private final IntWritable totalCount = new IntWritable();

        @Override
        public void reduce(Text key, Iterable<IntWritable> values, Context context)
                throws IOException, InterruptedException {
            int sum = 0;
            for (IntWritable value : values) {
                sum += value.get();
            }
            totalCount.set(sum);
            context.write(key, totalCount);
        }
    }

    public static void main(String[] args)
            throws IOException, ClassNotFoundException, InterruptedException {

        if (args.length != 2) {
            System.err.println("Usage: MapRWordCount <input_path> <output_path>");
            System.exit(-1);
        }

        Job job = new Job();
        job.setJarByClass(MapRWordCount.class);
        job.setJobName("MapReduce Word Count");

        FileInputFormat.addInputPath(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));

        job.setMapperClass(WordCountMapper.class);
        job.setCombinerClass(WordCountReducer.class);
        job.setReducerClass(WordCountReducer.class);

        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);

        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}
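For what it's worth, the mapper's tokenization logic can be checked on its own without any Hadoop JARs at all. This is a minimal standalone sketch (the class name TokenizeDemo is made up) that applies the same \w+ matching and lowercasing the mapper uses:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class TokenizeDemo {
    // Same pattern the mapper compiles: runs of word characters.
    private static final Pattern WORD = Pattern.compile("\\w+");

    // Mirrors the mapper's loop: find each \w+ run and lowercase it.
    static List<String> tokens(String line) {
        List<String> out = new ArrayList<>();
        Matcher m = WORD.matcher(line);
        while (m.find()) {
            out.add(m.group().toLowerCase());
        }
        return out;
    }

    public static void main(String[] args) {
        // prints [hello, hadoop, world, hello]
        System.out.println(tokens("Hello, Hadoop World! hello"));
    }
}
```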
Answer 1:
I found the JARs in the following locations:
$HADOOP_HOME/share/hadoop/common/hadoop-common-2.2.0.jar
$HADOOP_HOME/share/hadoop/mapreduce/hadoop-mapreduce-client-core-2.2.0.jar
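With those two JARs on the classpath, a command-line compile and submit might look like this (a sketch with illustrative paths; it assumes HADOOP_HOME points at a Hadoop 2.2.0 installation and the hadoop CLI is on PATH):

```shell
# Compile against the two JARs listed above
CP="$HADOOP_HOME/share/hadoop/common/hadoop-common-2.2.0.jar"
CP="$CP:$HADOOP_HOME/share/hadoop/mapreduce/hadoop-mapreduce-client-core-2.2.0.jar"
javac -classpath "$CP" MapRWordCount.java

# Package the classes; 'hadoop jar' supplies the full runtime classpath
jar cf wordcount.jar MapRWordCount*.class
hadoop jar wordcount.jar MapRWordCount /input /output
```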
Answer 2:
In Maven, I had to add the following to pom.xml and then do a clean build before the Mapper and Reducer classes would resolve in Java:
<dependency>
    <groupId>org.apache.hadoop</groupId>
    <artifactId>hadoop-common</artifactId>
    <version>2.2.0</version>
</dependency>
<dependency>
    <groupId>org.apache.hadoop</groupId>
    <artifactId>hadoop-mapreduce-client-core</artifactId>
    <version>2.2.0</version>
</dependency>
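If the project has no pom.xml yet, a minimal one carrying just these two dependencies might look like this (the project's own groupId/artifactId/version are illustrative placeholders):

```xml
<project xmlns="http://maven.apache.org/POM/4.0.0">
    <modelVersion>4.0.0</modelVersion>
    <groupId>com.example</groupId>
    <artifactId>wordcount</artifactId>
    <version>1.0</version>
    <dependencies>
        <dependency>
            <groupId>org.apache.hadoop</groupId>
            <artifactId>hadoop-common</artifactId>
            <version>2.2.0</version>
        </dependency>
        <dependency>
            <groupId>org.apache.hadoop</groupId>
            <artifactId>hadoop-mapreduce-client-core</artifactId>
            <version>2.2.0</version>
        </dependency>
    </dependencies>
</project>
```

After saving it, a plain "mvn clean compile" should pull the JARs and build the classes.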
Now the following don't throw errors:
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
Answer 3:
If you're just looking for the location of the appropriate JARs in Hadoop 2.2, then look under share/hadoop/{common,hdfs,mapreduce}. The files ending in -2.2.0.jar are likely what you are looking for.
This should be the same as in CDH4, unless you installed the "MR1" version, which matches the Hadoop 1.x structure.
Answer 4:
Use this link to find whatever JAR files you need.
Download them, then right-click your project and go to Build Path > Configure Build Path > Add External JARs.
Source: https://stackoverflow.com/questions/19436361/issue-with-org-apache-hadoop-mapreduce-imports-in-apache-hadoop-2-2