When using IDEA on a Mac to connect to a remote Hadoop cluster and debug a MapReduce program, following the usual blog posts always produced the error described below. This post documents one solution that worked on my Mac, for reference.
Preparation
Remote debugging requires some setup; the steps are listed briefly below.
(1) Prepare a local copy of Hadoop (some bloggers simply copy one from the cluster) and set the environment variable.
# replace with your actual Hadoop path
export HADOOP_HOME=/Users/yangchaolin/hadoop2.6.0/hadoop-2.6.0-cdh5.14.0
(2) In the IDEA project, add the jars under the local Hadoop's share directory as project dependencies.
(3) Prepare the MapReduce program and create an application (Run/Debug configuration) whose working directory is the local Hadoop; a sample configuration is sketched below.
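For reference, such a Run/Debug configuration might look roughly like this (the HDFS host and input/output paths are placeholders, not values from the original setup; the working directory is the local Hadoop path from step (1)):

Main class:        com.kaikeba.mapreduce.WordCountMain
Program arguments: hdfs://node01:8020/input/words.txt hdfs://node01:8020/output
Working directory: /Users/yangchaolin/hadoop2.6.0/hadoop-2.6.0-cdh5.14.0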
Map-side program
package com.kaikeba.mapreduce;

import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;

import java.io.IOException;

/**
 * The map side of the word-count MapReduce job.
 */
public class WordCountMap extends Mapper<LongWritable, Text, Text, IntWritable> {
    // most applications should override the map method
    @Override
    protected void map(LongWritable key, Text value, Context context) throws IOException, InterruptedException {
        // split the input line into words
        String readLine = value.toString();
        String[] words = readLine.split(" ");
        // emit each word with a count of 1
        for (String word : words) {
            context.write(new Text(word), new IntWritable(1));
        }
    }
}
Reduce-side program
package com.kaikeba.mapreduce;

import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Reducer;

import java.io.IOException;

/**
 * The reduce side of the word-count MapReduce job.
 */
public class WordCountReduce extends Reducer<Text, IntWritable, Text, IntWritable> {
    @Override
    protected void reduce(Text key, Iterable<IntWritable> values, Context context) throws IOException, InterruptedException {
        // sum the counts emitted for this word
        int count = 0;
        for (IntWritable i : values) {
            count += i.get();
        }
        // write the word and its total count to the job's output path
        context.write(key, new IntWritable(count));
    }
}
Driver (main method)
package com.kaikeba.mapreduce;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.input.TextInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
import org.apache.hadoop.mapreduce.lib.output.TextOutputFormat;

import java.io.IOException;

/**
 * The driver of the MapReduce job; 'WordCountMain' doubles as the job name.
 */
public class WordCountMain {
    public static void main(String[] args) throws IOException, ClassNotFoundException, InterruptedException {
        // check args: an input path and an output path are expected
        if (args == null || args.length != 2) {
            System.out.println("please input path");
            System.exit(0); // exit the main method
        }
        // create the mapreduce job
        Configuration conf = new Configuration();
        // to run the job on the cluster, set these properties (and mapred-site.xml must point at YARN)
        //conf.set("mapreduce.job.jar","/home/hadoop/IdeaProject/hadoop/target/hadoop-1.0-SNAPSHOT.jar");
        //conf.set("mapreduce.app-submission.cross-platform","true");
        //conf.set("mapreduce.framework.name","yarn");

        /*
         * Creates a new Job with no particular Cluster and a given jobName.
         * A Cluster will be created from the conf parameter only when it's needed.
         */
        Job job = Job.getInstance(conf, WordCountMain.class.getSimpleName());
        // set the jar by finding where the given class came from
        job.setJarByClass(WordCountMain.class);
        // set input/output formats
        job.setInputFormatClass(TextInputFormat.class);
        job.setOutputFormatClass(TextOutputFormat.class);
        // set input/output paths
        FileInputFormat.setInputPaths(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));
        // set map/reduce classes
        job.setMapperClass(WordCountMap.class);
        job.setReducerClass(WordCountReduce.class);
        // reuse the reducer as a combiner; safe here because summing counts is associative and commutative
        job.setCombinerClass(WordCountReduce.class);
        // set the map output key-value types
        job.setMapOutputKeyClass(Text.class);
        job.setMapOutputValueClass(IntWritable.class);
        // set the reduce (final) output key-value types
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);
        // use 4 reduce tasks
        job.setNumReduceTasks(4);
        // submit the job and wait for it to complete
        try {
            job.waitForCompletion(true);
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
}
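A side note on the three commented-out properties in the driver: when enabled (together with the cluster's addresses), they make IDEA submit the job to YARN instead of running it in the LocalJobRunner. A minimal sketch of that configuration in the driver's main method, assuming a placeholder host name node01 that is not from the original post:

Configuration conf = new Configuration();
conf.set("fs.defaultFS", "hdfs://node01:8020");               // placeholder NameNode address
conf.set("mapreduce.framework.name", "yarn");                 // submit to YARN instead of the LocalJobRunner
conf.set("yarn.resourcemanager.hostname", "node01");          // placeholder ResourceManager host
conf.set("mapreduce.app-submission.cross-platform", "true");  // needed when submitting from macOS or Windows
conf.set("mapreduce.job.jar", "/home/hadoop/IdeaProject/hadoop/target/hadoop-1.0-SNAPSHOT.jar");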
The contents of the other resource files are omitted, since the error discussed in this post does not stem from the resource configuration files themselves.
The program is packaged into a jar with Maven, and running it on Linux with hadoop jar <jar> <fully qualified class name> <input path> <output path> works without issue; the details are omitted here.
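For illustration, the command has this shape (the jar name is taken from the commented-out mapreduce.job.jar line in the driver; the HDFS paths are placeholders):

hadoop jar hadoop-1.0-SNAPSHOT.jar com.kaikeba.mapreduce.WordCountMain /input/words.txt /output/wordcount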
Fixing the error
Next, debugging Hadoop remotely from IDEA with the application configured as above, execution fails with the following error, complaining that a path or directory cannot be found.
19/10/14 00:53:02 WARN security.UserGroupInformation: PriviledgedActionException as:yangchaolin (auth:SIMPLE) cause:ExitCodeException exitCode=1: chmod: /kkb/install/hadoop-2.6.0-cdh5.14.2/hadoopDatas/tempDatas/mapred/staging/yangchaolin90307621/.staging/job_local90307621_0001: No such file or directory

Exception in thread "main" ExitCodeException exitCode=1: chmod: /kkb/install/hadoop-2.6.0-cdh5.14.2/hadoopDatas/tempDatas/mapred/staging/yangchaolin90307621/.staging/job_local90307621_0001: No such file or directory

	at org.apache.hadoop.util.Shell.runCommand(Shell.java:604)
	at org.apache.hadoop.util.Shell.run(Shell.java:507)
	at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:789)
	at org.apache.hadoop.util.Shell.execCommand(Shell.java:882)
	at org.apache.hadoop.util.Shell.execCommand(Shell.java:865)
	at org.apache.hadoop.fs.RawLocalFileSystem.setPermission(RawLocalFileSystem.java:720)
	at org.apache.hadoop.fs.ChecksumFileSystem$1.apply(ChecksumFileSystem.java:498)
	at org.apache.hadoop.fs.ChecksumFileSystem$FsOperation.run(ChecksumFileSystem.java:479)
	at org.apache.hadoop.fs.ChecksumFileSystem.setPermission(ChecksumFileSystem.java:495)
	at org.apache.hadoop.fs.FileSystem.mkdirs(FileSystem.java:616)
	at org.apache.hadoop.mapred.JobClient.copyAndConfigureFiles(JobClient.java:814)
	at org.apache.hadoop.mapred.JobClient.copyAndConfigureFiles(JobClient.java:774)
	at org.apache.hadoop.mapred.JobClient.access$400(JobClient.java:178)
	at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:991)
	at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:976)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.Subject.doAs(Subject.java:422)
	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1920)
	at org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:976)
	at org.apache.hadoop.mapreduce.Job.submit(Job.java:582)
	at org.apache.hadoop.mapreduce.Job.waitForCompletion(Job.java:612)
	at com.kaikeba.mapreduce.WordCountMain.main(WordCountMain.java:60)

Process finished with exit code 1
(1) I tried changing the permissions of the Hadoop installation directory on the cluster to 777 and tested again; the error persisted.
(2) Since the error complains about a missing directory, I compared notes with classmates on Windows and found that on their machines this directory (the one specified in the resource configuration files) is created automatically on the local disk. My Mac had no such directory. So I created the top-level directory at the Mac's filesystem root, granted it 777 permissions, and this time the run went through! I suspect this is related to user permissions on the Mac: on Windows the user is usually an administrator, while on a Mac that is not the case.
# change to the root directory
youngchaolinMac:/ yangchaolin$ cd /
# create the top-level directory
youngchaolinMac:/ yangchaolin$ sudo mkdir kkb
Password:
youngchaolinMac:/ yangchaolin$ ls -l
total 45...
drwxr-xr-x 2 root wheel 68 10 23 22:52 kkb
# grant 777 permissions
youngchaolinMac:/ yangchaolin$ sudo chmod -R 777 kkb
youngchaolinMac:/ yangchaolin$ ls -l
total 45...
drwxrwxrwx 2 root wheel 68 10 23 22:52 kkb
IDEA now runs the job successfully and the MapReduce task completes.
Checking the computed results on the cluster: everything looks correct.
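An alternative worth noting (my own assumption, not something tested in this post): the staging path in the error matches ${hadoop.tmp.dir}/mapred/staging/..., i.e. it is derived from the hadoop.tmp.dir value in the configuration files copied from the cluster. Instead of recreating /kkb on the Mac, one could override that property with a writable local path in the driver, before the Job is created:

// assumption: redirect hadoop.tmp.dir to a local path that exists on the Mac,
// so the local staging directory (${hadoop.tmp.dir}/mapred/staging) can be created
Configuration conf = new Configuration();
conf.set("hadoop.tmp.dir", System.getProperty("java.io.tmpdir") + "/hadoop-local");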
Many blogs online offer various solutions for this error, but most of them target Windows. If you are on a Mac, this post may serve as one possible fix.
Reference:
(1)https://www.cnblogs.com/yjmyzz/p/how-to-remote-debug-hadoop-with-eclipse-and-intellij-idea.html