In the new API (org.apache.hadoop.mapreduce.lib.input.KeyValueTextInputFormat), how do I specify a separator (delimiter) other than tab (the default) to separate the key and the value?
Samp
For KeyValueTextInputFormat, each input line should be a key-value pair separated by "\t" (a tab) by default:
Key1 Value1,Value2
By changing the default separator, you can have each line split into key and value the way you want.
For the new API, here is the solution:
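// New API
Configuration conf = new Configuration();
// Set the separator before creating the Job, because Job copies the Configuration.
// Newer Hadoop releases name this property
// "mapreduce.input.keyvaluelinerecordreader.key.value.separator" and treat the name below as deprecated.
conf.set("key.value.separator.in.input.line", ",");
Job job = new Job(conf);
job.setInputFormatClass(KeyValueTextInputFormat.class);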
Map
public class Map extends Mapper<Text, Text, Text, IntWritable> {
    private final static IntWritable one = new IntWritable(1);
    private Text word = new Text();

    public void map(Text key, Text value, Context context)
            throws IOException, InterruptedException {
        String line = value.toString();
        System.out.println("key---> " + key);
        System.out.println("value---> " + value.toString());
.
.
Output
key---> one
value---> first line
key---> two
value---> second line
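
To make the splitting rule concrete, here is a minimal plain-Java sketch with no Hadoop dependency. It mimics how the record reader divides each line at the first occurrence of the separator. The class name SeparatorSketch, the method splitKeyValue, and the sample input lines are hypothetical and only for illustration; in a real job the splitting happens inside Hadoop's KeyValueLineRecordReader, which works on bytes rather than on a String.

```java
// Plain-Java illustration only; SeparatorSketch and splitKeyValue are
// hypothetical names, not part of the Hadoop API.
final class SeparatorSketch {

    // Split a line at the FIRST occurrence of the separator.
    // If the separator does not occur, the whole line becomes the key
    // and the value is empty.
    static String[] splitKeyValue(String line, char separator) {
        int pos = line.indexOf(separator);
        if (pos == -1) {
            return new String[] { line, "" };
        }
        return new String[] { line.substring(0, pos), line.substring(pos + 1) };
    }

    public static void main(String[] args) {
        // Assumed comma-separated input lines (not taken from the original post).
        String[] lines = { "one,first line", "two,second line", "Key1,Value1,Value2" };
        for (String line : lines) {
            String[] kv = splitKeyValue(line, ',');
            System.out.println("key---> " + kv[0]);
            System.out.println("value---> " + kv[1]);
        }
    }
}
```

Note that only the first separator counts: for the assumed line "Key1,Value1,Value2" the key is "Key1" and the value keeps its own comma as "Value1,Value2". A line without any separator ends up entirely in the key, with an empty value.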