Getting Filename/FileData as key/value input for Map when running a Hadoop MapReduce Job


I went through the question How to get Filename/File Contents as key/value input for MAP when running a Hadoop MapReduce Job? here. Though it explains the concept, I am unab


1 Answer

南笙

2020-12-20 02:08

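Put this code in your CustomRecordReader class. It uses the old org.apache.hadoop.mapred API and wraps a LineRecordReader so that each record's key is the file name and its value is one line of the file.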

private LineRecordReader lineReader;

private String fileName;

public CustomRecordReader(JobConf job, FileSplit split) throws IOException {
    lineReader = new LineRecordReader(job, split);
    fileName = split.getPath().getName();
}

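public boolean next(Text key, Text value) throws IOException {
    // LineRecordReader produces (byte offset, line) pairs, so its key is a
    // LongWritable (import org.apache.hadoop.io.LongWritable), not a Text.
    // Read the offset into a scratch key and discard it.
    LongWritable offset = lineReader.createKey();
    if (!lineReader.next(offset, value)) {
        return false; // end of this split
    }

    // Emit the file name as the key; value already holds the line just read.
    key.set(fileName);

    return true;
}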

public Text createKey() {
    return new Text("");
}

public Text createValue() {
    return new Text("");
}

Remove the SPDRecordReader constructor (it is an error).

And put this code in your CustomFileInputFormat class:

public RecordReader<Text, Text> getRecordReader(
  InputSplit input, JobConf job, Reporter reporter)
  throws IOException {

    reporter.setStatus(input.toString());
    return new CustomRecordReader(job, (FileSplit)input);
}
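To see what the record reader above does without a Hadoop cluster, here is a minimal plain-Java sketch of the same delegation pattern. It is only a model, not Hadoop code: MutableText, LineSource, FileNameKeyReader and readAll are hypothetical stand-ins invented for this illustration, and LineSource reports a simple line counter where the real LineRecordReader reports a byte offset.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Scanner;

public class FileNameKeySketch {

    /** Hypothetical stand-in for Hadoop's Text: a reusable, mutable string holder. */
    static final class MutableText {
        private String s = "";
        void set(String v) { s = v; }
        @Override public String toString() { return s; }
    }

    /** Hypothetical stand-in for LineRecordReader: yields (position, line) pairs. */
    static final class LineSource {
        private final Scanner in;
        private long position = 0;

        LineSource(String contents) { in = new Scanner(contents); }

        boolean next(long[] positionOut, MutableText value) {
            if (!in.hasNextLine()) {
                return false; // no more lines
            }
            positionOut[0] = position++;
            value.set(in.nextLine());
            return true;
        }
    }

    /** Mirrors CustomRecordReader: delegate for the line, emit the file name as key. */
    static final class FileNameKeyReader {
        private final LineSource lines;
        private final String fileName;
        private final long[] scratch = new long[1]; // delegate's key, discarded

        FileNameKeyReader(String fileName, String contents) {
            this.lines = new LineSource(contents);
            this.fileName = fileName;
        }

        boolean next(MutableText key, MutableText value) {
            if (!lines.next(scratch, value)) {
                return false;
            }
            key.set(fileName); // replace the position key with the file name
            return true;
        }
    }

    /** Drains a reader the way a map task would, collecting "key<TAB>value" strings. */
    static List<String> readAll(String fileName, String contents) {
        FileNameKeyReader reader = new FileNameKeyReader(fileName, contents);
        MutableText key = new MutableText();
        MutableText value = new MutableText();
        List<String> out = new ArrayList<>();
        while (reader.next(key, value)) {
            out.add(key + "\t" + value);
        }
        return out;
    }

    public static void main(String[] args) {
        for (String pair : readAll("a.txt", "first\nsecond\n")) {
            System.out.println(pair);
        }
    }
}
```

Every record carries the same key, the file name, and one line as its value, which is the shape of input the map function would receive from the CustomRecordReader above.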
