Key of Object type in the Hadoop mapper

说谎 2021-01-13 06:08

I am new to Hadoop and trying to understand the MapReduce WordCount example code from here.

The mapper from the documentation is:

Mapper

        
1 answer
  • 2021-01-13 06:38

    InputFormat describes the input specification for a MapReduce job. By default, Hadoop uses TextInputFormat, which extends FileInputFormat, to process the input files.

    We can also specify the input format to use in the client or driver code:

    job.setInputFormatClass(SomeInputFormat.class);
    

    With TextInputFormat, files are broken into lines. Each key is the byte offset at which a line starts within the file, and each value is the text of that line.

    In public void map(Object key, Text value, Context context), key is the byte offset of the line (not its line number) and value is the line's text.
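To make the key/value pairing concrete, here is a minimal sketch in plain Java that imitates what the mapper receives. It does not use Hadoop at all: OffsetKeyDemo and toRecords are hypothetical names, Long and String stand in for LongWritable and Text, and the sketch assumes UTF-8 text with single-byte '\n' line endings.

```java
import java.nio.charset.StandardCharsets;
import java.util.AbstractMap.SimpleEntry;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for the records TextInputFormat hands to map();
// this is not Hadoop code.
class OffsetKeyDemo {

    // Splits the text on '\n' and pairs each line with the byte offset
    // at which that line starts.
    static List<Map.Entry<Long, String>> toRecords(String fileContents) {
        List<Map.Entry<Long, String>> records = new ArrayList<>();
        long offset = 0;
        for (String line : fileContents.split("\n")) {
            records.add(new SimpleEntry<Long, String>(offset, line));
            // Advance by the line's UTF-8 byte length plus one byte for '\n'.
            offset += line.getBytes(StandardCharsets.UTF_8).length + 1;
        }
        return records;
    }

    public static void main(String[] args) {
        String contents = "hello world\nbye world\nhello hadoop\n";
        for (Map.Entry<Long, String> record : toRecords(contents)) {
            System.out.println(record.getKey() + "\t" + record.getValue());
        }
    }
}
```

For the three-line input in main, the keys are 0, 12 and 22: each key is where the line begins in the file, not its line number.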

    See the TextInputFormat API: https://hadoop.apache.org/docs/current/api/org/apache/hadoop/mapreduce/lib/input/TextInputFormat.html

    With TextInputFormat, the key is of type LongWritable and the value is of type Text. In your example, Object is declared in place of LongWritable; this works because LongWritable is a subclass of Object, so every key the mapper receives is also an Object. You can also declare the key as LongWritable instead of Object.
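The compatibility argument can be illustrated with a small plain-Java analogy. This is only a sketch: MiniMapper, ObjectKeyMapper, LongKeyMapper and run are hypothetical names rather than Hadoop classes, and Long and String again stand in for LongWritable and Text. The "framework" method always supplies Long keys, and it accepts a mapper whose key type is either Long or a supertype such as Object.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical miniature of the mapper contract; not Hadoop code.
class KeyTypeDemo {

    interface MiniMapper<K, V> {
        void map(K key, V value, List<String> out);
    }

    // Declares the key as Object and never reads it, so any key type is acceptable.
    static class ObjectKeyMapper implements MiniMapper<Object, String> {
        public void map(Object key, String value, List<String> out) {
            for (String word : value.trim().split("\\s+")) {
                out.add(word);
            }
        }
    }

    // Declares the precise key type, so the offset can be used without a cast.
    static class LongKeyMapper implements MiniMapper<Long, String> {
        public void map(Long key, String value, List<String> out) {
            out.add(key + ":" + value);
        }
    }

    // The "framework" always passes Long keys; a mapper qualifies if its
    // declared key type is Long or any supertype of Long.
    static List<String> run(MiniMapper<? super Long, String> mapper) {
        List<String> out = new ArrayList<>();
        mapper.map(Long.valueOf(0L), "hello world", out);
        mapper.map(Long.valueOf(12L), "bye world", out);
        return out;
    }

    public static void main(String[] args) {
        System.out.println(run(new ObjectKeyMapper()));
        System.out.println(run(new LongKeyMapper()));
    }
}
```

Declaring the key as Object is enough when the mapper ignores it; declaring the precise type is the better choice when the mapper actually needs the offset.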
