Question
The last few days I've been experimenting with Hadoop. I'm running Hadoop in pseudo-distributed mode on Ubuntu 12.10 and successfully executed some standard MapReduce jobs.
Next I wanted to start experimenting with HBase. I installed HBase and played with the shell a bit. That all went fine, so I wanted to try HBase from a simple Java program: import the output of one of the previous MapReduce jobs and load it into an HBase table. I've written a Mapper that should produce HFileOutputFormat files that can easily be loaded into an HBase table.
Now, whenever I run the program (using: hadoop jar [compiled jar]) I get a ClassNotFoundException. The program seems unable to resolve com.google.common.primitives.Longs. Of course, I thought a dependency was simply missing, but the JAR (Google's Guava) is there.
I've tried a lot of different things but can't seem to find a solution.
I've attached the exception that occurs and the most important classes. I would truly appreciate it if someone could help me out or give me some advice on where to look.
Kind regards, Pieterjan
ERROR
12/12/13 09:02:54 WARN snappy.LoadSnappy: Snappy native library not loaded
12/12/13 09:03:00 INFO mapred.JobClient: Running job: job_201212130304_0020
12/12/13 09:03:01 INFO mapred.JobClient: map 0% reduce 0%
12/12/13 09:04:07 INFO mapred.JobClient: map 100% reduce 0%
12/12/13 09:04:51 INFO mapred.JobClient: Task Id : attempt_201212130304_0020_r_000000_0,Status : FAILED
Error: java.lang.ClassNotFoundException: com.google.common.primitives.Longs
at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
at java.security.AccessController.doPrivileged(Native Method)
at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
at java.lang.ClassLoader.loadClass(ClassLoader.java:423)
at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:308)
at java.lang.ClassLoader.loadClass(ClassLoader.java:356)
at org.apache.hadoop.hbase.KeyValue$KVComparator.compare(KeyValue.java:1554)
at org.apache.hadoop.hbase.KeyValue$KVComparator.compare(KeyValue.java:1536)
at java.util.TreeMap.compare(TreeMap.java:1188)
at java.util.TreeMap.put(TreeMap.java:531)
at java.util.TreeSet.add(TreeSet.java:255)
at org.apache.hadoop.hbase.mapreduce.PutSortReducer.reduce(PutSortReducer.java:63)
at org.apache.hadoop.hbase.mapreduce.PutSortReducer.reduce(PutSortReducer.java:40)
at org.apache.hadoop.mapreduce.Reducer.run(Reducer.java:176)
at org.apache.hadoop.mapred.ReduceTask.runNewReducer(ReduceTask.java:650)
at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:418)
at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1136)
at org.apache.hadoop.mapred.Child.main(Child.java:249)
JAVA
Mapper:
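import java.io.IOException;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.io.ImmutableBytesWritable;
import org.apache.hadoop.hbase.util.Bytes;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;
public class TestHBaseMapper extends Mapper<LongWritable, Text, ImmutableBytesWritable, Put> {
    @Override
    public void map(LongWritable key, Text value, Context context) throws IOException, InterruptedException {
        // Tab delimiter: \t, whitespace delimiter: \\s+
        String[] s = value.toString().split("\t");
        Put put = new Put(Bytes.toBytes(s[0]));
        // Text.getBytes() exposes the backing array, which can be longer than the data, so convert via toString().
        put.add(Bytes.toBytes("amount"), Bytes.toBytes("value"), Bytes.toBytes(value.toString()));
        context.write(new ImmutableBytesWritable(Bytes.toBytes(s[0])), put);
    }
}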
Job:
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.conf.Configured;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.io.ImmutableBytesWritable;
import org.apache.hadoop.hbase.mapreduce.HFileOutputFormat;
import org.apache.hadoop.hbase.mapreduce.LoadIncrementalHFiles;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.TextInputFormat;
import org.apache.hadoop.util.Tool;

public class TestHBaseRun extends Configured implements Tool {
    @Override
    public int run(String[] args) throws Exception {
        try {
            Configuration configuration = getConf();
            Job hbasejob = new Job(configuration);
            hbasejob.setJobName("TestHBaseJob");
            hbasejob.setJarByClass(TestHBaseRun.class);
            // Specifies the InputFormat and the path.
            hbasejob.setInputFormatClass(TextInputFormat.class);
            TextInputFormat.setInputPaths(hbasejob, new Path("/hadoopdir/user/data/output/test/"));
            // Set Mapper, MapperOutputKey and MapperOutputValue classes.
            hbasejob.setMapperClass(TestHBaseMapper.class);
            hbasejob.setMapOutputKeyClass(ImmutableBytesWritable.class);
            hbasejob.setMapOutputValueClass(Put.class);
            // Specifies the OutputFormat and the path. If the path exists, it is reinitialized.
            // In this case HFiles, which can be imported into HBase, are produced.
            hbasejob.setOutputFormatClass(HFileOutputFormat.class);
            FileSystem fs = FileSystem.get(configuration);
            Path outputpath = new Path("/hadoopdir/user/data/hbase/table/");
            fs.delete(outputpath, true);
            HFileOutputFormat.setOutputPath(hbasejob, outputpath);
            // Checks whether the table exists in HBase and creates it if necessary.
            // HBaseUtil is my own helper class (not shown here).
            HBaseUtil util = new HBaseUtil(configuration);
            if (!util.exists("test")) {
                util.createTable("test", new String[]{"amount"});
            }
            // Reads the existing (or newly created) table.
            Configuration hbaseconfiguration = HBaseConfiguration.create(configuration);
            HTable table = new HTable(hbaseconfiguration, "test");
            // Write HFiles to disk. Autoconfigures partitioner and reducer.
            HFileOutputFormat.configureIncrementalLoad(hbasejob, table);
            boolean success = hbasejob.waitForCompletion(true);
            // Load generated files into table.
            LoadIncrementalHFiles loader;
            loader = new LoadIncrementalHFiles(hbaseconfiguration);
            loader.doBulkLoad(outputpath, table);
            return success ? 0 : 1;
        } catch (Exception ex) {
            System.out.println("Error: " + ex.getMessage());
        }
        return 1;
    }
}
Answer 1:
A ClassNotFoundException means that the .jar containing com.google.common.primitives.Longs cannot be found on the classpath of the JVM that threw it (here, the reduce task).
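Before changing any configuration, it can help to confirm what a given JVM can actually see. The following is a minimal sketch, not part of Hadoop or HBase: ClasspathProbe and its methods are hypothetical names chosen for illustration, and it uses only the standard Java library. Note that it reports on the JVM it runs in, so running it on the client says nothing definite about the task JVMs; to inspect a task, the same check would have to be called from inside the mapper or reducer.

```java
import java.security.CodeSource;

/** Hypothetical diagnostic helper: reports whether a class is visible to the current JVM. */
final class ClasspathProbe {

    /** Returns true if the named class can be resolved (without initializing it). */
    static boolean isLoadable(String className) {
        try {
            Class.forName(className, false, ClasspathProbe.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }

    /** Returns the jar or directory the class was loaded from, or a placeholder string. */
    static String locationOf(String className) {
        try {
            Class<?> c = Class.forName(className, false, ClasspathProbe.class.getClassLoader());
            CodeSource src = c.getProtectionDomain().getCodeSource();
            return src == null ? "(bootstrap or unknown)" : String.valueOf(src.getLocation());
        } catch (ClassNotFoundException | LinkageError e) {
            return "(not found)";
        }
    }

    public static void main(String[] args) {
        // Defaults to the class named in the stack trace; pass another name as the first argument.
        String name = args.length > 0 ? args[0] : "com.google.common.primitives.Longs";
        System.out.println(name + " loadable: " + isLoadable(name));
        System.out.println("loaded from: " + locationOf(name));
        System.out.println("java.class.path = " + System.getProperty("java.class.path"));
    }
}
```

If the class is loadable on the client but the task still fails, the jar is most likely missing from the task classpath only, which is what the options below address.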
There are several ways to solve this issue:
- If you're just playing with Hadoop, the simplest way to solve this issue is to copy the required .jar into /usr/share/hadoop/lib.
- Add the path to the required .jar to HADOOP_CLASSPATH. To do so, open /etc/hbase/hbase-env.sh and add: export HADOOP_CLASSPATH="<jar_files>:$HADOOP_CLASSPATH"
- Create a folder /lib in your project's root folder, copy the required .jar into it, and package your project as a .jar. The result will be a fat jar containing all the jars included in /lib.
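For the fat-jar option, a quick sanity check is to list what actually ended up under lib/ in the packaged job jar. The sketch below is an illustration only: BundledJarLister is a hypothetical name, it relies solely on java.util.jar from the standard library, and it assumes the dependencies are stored as lib/*.jar entries inside the job jar, as described in the last option above.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.Enumeration;
import java.util.List;
import java.util.jar.JarEntry;
import java.util.jar.JarFile;

/** Hypothetical helper: lists the dependency jars bundled under lib/ inside a job jar. */
final class BundledJarLister {

    /** Returns the names of all entries that match lib/...jar, in archive order. */
    static List<String> bundledJars(String jobJarPath) throws IOException {
        List<String> found = new ArrayList<String>();
        try (JarFile jar = new JarFile(jobJarPath)) {
            Enumeration<JarEntry> entries = jar.entries();
            while (entries.hasMoreElements()) {
                String name = entries.nextElement().getName();
                if (name.startsWith("lib/") && name.endsWith(".jar")) {
                    found.add(name);
                }
            }
        }
        return found;
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("usage: java BundledJarLister <job jar>");
            return;
        }
        for (String name : bundledJars(args[0])) {
            System.out.println(name);
        }
    }
}
```

If the Guava jar does not show up in the output, the packaging step did not include it, and the task will fail with the same ClassNotFoundException.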
Source: https://stackoverflow.com/questions/13863382/load-mapreduce-output-data-into-hbase