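About this article: Developing Hadoop programs in an IDE on a Linux GUI desktop consumes a lot of system resources. On PCs with modest specs this can cause noticeable lag, which seriously gets in the way of writing code. This article explains in detail how to do Hadoop development on Windows instead. I hope it is useful to everyone learning Hadoop.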
Tools:
- Hadoop Eclipse plugin: hadoop-eclipse-plugin-2.7.3.jar
- Hadoop support component for Windows: winutils.exe
- Hadoop native dependency library: hadoop.dll
Download for the tools above: download link
Hadoop version: hadoop-2.7.3
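Configuration steps:
- Start Hadoop: start-yarn.sh, start-dfs.sh
- On Windows, map the Linux host's IP address to its host name in the local hosts file (optional; using the IP address directly also works)
- Copy hadoop-eclipse-plugin-2.7.3.jar into Eclipse's plugins directory, then start Eclipse
- Configure Hadoop in Eclipse
- Switch to the MapReduce view to see the contents of the HDFS file system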
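Before configuring the plugin, the host mapping and port from the steps above can be checked from the Windows side. The following is only a sketch, not part of the original setup: it uses nothing but the Java standard library, the class name `ClusterReachability` is made up for illustration, and the defaults `master` and `9090` are assumptions taken from the HDFS URL in this article's test code, so substitute your own values.

```java
import java.io.IOException;
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.net.Socket;
import java.net.UnknownHostException;

public class ClusterReachability {

    /** Returns the IP address a host name resolves to, or null if it cannot be resolved. */
    public static String resolve(String host) {
        try {
            return InetAddress.getByName(host).getHostAddress();
        } catch (UnknownHostException e) {
            return null;
        }
    }

    /** Returns true if a TCP connection to host:port succeeds within timeoutMillis. */
    public static boolean canConnect(String host, int port, int timeoutMillis) {
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), timeoutMillis);
            return true;
        } catch (IOException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // Assumed defaults: host "master" and HDFS port 9090, as in this article's test code
        String host = args.length > 0 ? args[0] : "master";
        int port = args.length > 1 ? Integer.parseInt(args[1]) : 9090;

        String ip = resolve(host);
        if (ip == null) {
            System.out.println(host + " does not resolve; check the hosts mapping");
            return;
        }
        System.out.println(host + " resolves to " + ip);
        System.out.println("port " + port
                + (canConnect(host, port, 3000) ? " is reachable" : " is NOT reachable"));
    }
}
```

If the host name does not resolve, the hosts mapping is the first thing to look at. If it resolves but the port is not reachable, HDFS may not be running or a firewall may be blocking the connection.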
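Running a MapReduce program
- Configure the Hadoop environment variables: mainly, add the bin and sbin directories to PATH
- Put winutils.exe in Hadoop's bin directory and hadoop.dll in the Windows System32 directory
- Test code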
    import java.io.IOException;
    import java.util.StringTokenizer;

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.io.IntWritable;
    import org.apache.hadoop.io.LongWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Job;
    import org.apache.hadoop.mapreduce.Mapper;
    import org.apache.hadoop.mapreduce.Reducer;
    import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
    import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

    /**
     * Counts how often each word occurs in the input text.
     * @author Zerone1993
     */
    public class WordCount {

        static class WordMapper extends Mapper<LongWritable, Text, Text, IntWritable> {
            @Override
            protected void map(LongWritable key, Text value, Context context)
                    throws IOException, InterruptedException {
                // Split the line on whitespace and emit (word, 1) for every token
                StringTokenizer st = new StringTokenizer(value.toString());
                while (st.hasMoreTokens()) {
                    context.write(new Text(st.nextToken()), new IntWritable(1));
                }
            }
        }

        static class WordReducer extends Reducer<Text, IntWritable, Text, IntWritable> {
            @Override
            protected void reduce(Text key, Iterable<IntWritable> values, Context context)
                    throws IOException, InterruptedException {
                // Sum all the counts emitted for this word
                int sum = 0;
                for (IntWritable value : values) {
                    sum += value.get();
                }
                context.write(key, new IntWritable(sum));
            }
        }

        public static void main(String[] args) {
            Configuration conf = new Configuration();
            // Kept from the original; note that this is a Hadoop 1.x property name, deprecated in Hadoop 2.x
            conf.set("mapred.job.tracker", "master:50020");
            try {
                Job job = Job.getInstance(conf, "wordCount");
                job.setJarByClass(WordCount.class);            // class used to locate the job jar
                job.setMapperClass(WordMapper.class);          // Mapper class
                job.setReducerClass(WordReducer.class);        // Reducer class
                job.setMapOutputKeyClass(Text.class);          // key type emitted by the mapper
                job.setMapOutputValueClass(IntWritable.class); // value type emitted by the mapper
                job.setOutputKeyClass(Text.class);             // key type of the final output
                job.setOutputValueClass(IntWritable.class);    // value type of the final output
                job.setNumReduceTasks(1);                      // number of reduce tasks
                // Input and output directories of the MapReduce job
                FileInputFormat.addInputPath(job, new Path("hdfs://master:9090/user/squirrel/input/mapreduce/"));
                FileOutputFormat.setOutputPath(job, new Path("hdfs://master:9090/user/squirrel/output/mapreduce/"));
                // Wait for the whole MapReduce job to finish
                System.exit(job.waitForCompletion(true) ? 0 : 1);
            } catch (Exception e) {
                e.printStackTrace();
                System.exit(1);
            }
        }
    }
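To check the job's output on a small sample, the same counting logic can be reproduced locally without a cluster. This is a minimal sketch under stated assumptions, not the author's code: the class `LocalWordCount` and the sample lines are made up for illustration, and a `TreeMap` stands in for the sorted keys the reducer receives.

```java
import java.util.Arrays;
import java.util.Map;
import java.util.StringTokenizer;
import java.util.TreeMap;

public class LocalWordCount {

    /** Counts whitespace-separated tokens: one per token (map step), summed per word (reduce step). */
    public static Map<String, Integer> count(Iterable<String> lines) {
        Map<String, Integer> counts = new TreeMap<>(); // keys kept in sorted order
        for (String line : lines) {
            StringTokenizer st = new StringTokenizer(line); // same tokenizer as WordMapper
            while (st.hasMoreTokens()) {
                counts.merge(st.nextToken(), 1, Integer::sum);
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        // Hypothetical sample input, two lines of text
        Map<String, Integer> counts = count(Arrays.asList("hello hadoop", "hello windows"));
        // Prints one "word<TAB>count" line per word: hadoop 1, hello 2, windows 1
        counts.forEach((word, n) -> System.out.println(word + "\t" + n));
    }
}
```

Running the real job on a file with the same two lines should give the same three counts, which makes this a quick way to tell a logic error from a cluster or configuration problem.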
Result: [screenshot]
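If the job fails on Windows before producing any output, a missing winutils.exe or hadoop.dll is a likely cause. The sketch below checks the two file locations named in the steps above. It is an illustration only: the class `HadoopWindowsCheck` is hypothetical, and it assumes the Hadoop directory is exposed through an environment variable named `HADOOP_HOME` and that the Windows directory is given by `SystemRoot`, neither of which the original text states explicitly.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.List;

public class HadoopWindowsCheck {

    /**
     * Returns a list of problems found; an empty list means both files are in place.
     * hadoopHome is the Hadoop installation directory, system32 the Windows System32 directory.
     */
    public static List<String> findProblems(String hadoopHome, String system32) {
        List<String> problems = new ArrayList<>();
        if (hadoopHome == null || hadoopHome.isEmpty()) {
            problems.add("HADOOP_HOME is not set");
        } else if (!new File(hadoopHome, "bin/winutils.exe").isFile()) {
            problems.add("winutils.exe not found in " + new File(hadoopHome, "bin"));
        }
        if (system32 == null) {
            problems.add("System32 directory is unknown");
        } else if (!new File(system32, "hadoop.dll").isFile()) {
            problems.add("hadoop.dll not found in " + system32);
        }
        return problems;
    }

    public static void main(String[] args) {
        // Assumption: HADOOP_HOME points at the Hadoop directory, SystemRoot at C:\Windows
        String hadoopHome = System.getenv("HADOOP_HOME");
        String systemRoot = System.getenv("SystemRoot");
        String system32 = systemRoot == null ? null : new File(systemRoot, "System32").getPath();

        List<String> problems = findProblems(hadoopHome, system32);
        if (problems.isEmpty()) {
            System.out.println("winutils.exe and hadoop.dll are in place");
        } else {
            problems.forEach(System.out::println);
        }
    }
}
```

The check only confirms that the files exist. It does not verify that they match the installed Hadoop version (hadoop-2.7.3 here).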
Source: oschina
Link: https://my.oschina.net/u/4074151/blog/3014236