How to read hadoop sequential file?

房东的猫 提交于 2019-12-30 08:21:15

问题


I have a sequential file which is the output of hadoop map-reduce job. In this file data is written in key value pairs ,and value itself is a map. I want to read the value as a MAP object so that i can process it further.

    Configuration config = new Configuration();
    Path path = new Path("D:\\OSP\\sample_data\\data\\part-00000");
    SequenceFile.Reader reader = new SequenceFile.Reader(FileSystem.get(config), path, config);
    WritableComparable key = (WritableComparable) reader.getKeyClass().newInstance();
    Writable value = (Writable) reader.getValueClass().newInstance();
    long position = reader.getPosition();

    while(reader.next(key,value))
    {
           System.out.println("Key is: "+textKey +" value is: "+val+"\n");
    }

output of program: Key is: [this is key] value is: {abc=839177, xyz=548498, lmn=2, pqr=1}

Here i am getting value as string ,but i want it as a object of map.


回答1:


Check the API documentation for SequenceFile#next(Writable, Writable)

while(reader.next(key,value))
{
       System.out.println("Key is: "+textKey +" value is: "+val+"\n");
}

should be replaced with

while(reader.next(key,value))
{
       System.out.println("Key is: "+key +" value is: "+value+"\n");
}

Use SequenceFile.Reader#getValueClassName to get the value type in the SequenceFile. SequenceFile have the key/value types in the file header.



来源:https://stackoverflow.com/questions/8265256/how-to-read-hadoop-sequential-file

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!