How to convert .txt file to Hadoop's sequence file format

To use MapReduce jobs effectively in Hadoop, I need the data to be stored in Hadoop's sequence file format. However, the data is currently only in flat .txt format. Can anyo

7 Answers

太阳男子

2020-11-29 02:22

If your data is not on HDFS, you need to upload it to HDFS. Two options:

i) Run hdfs dfs -put (or hadoop fs -put) on your .txt file; once it is on HDFS, you can convert it to a sequence file.

ii) Read the text file as input on your HDFS client machine and convert it to a sequence file with the SequenceFile API: create a SequenceFile.Writer and append (key, value) pairs to it.

If you don't care about the key, you can use the line number as the key and the complete line of text as the value.
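Option ii) could look roughly like the sketch below. It assumes hadoop-common (Hadoop 2.x or later) is on the classpath; the class name TxtToSeqFile and the method convert are made up for illustration. It uses the line number as a LongWritable key and the line as a Text value, as suggested above.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.SequenceFile;
import org.apache.hadoop.io.Text;

// Hypothetical helper class; not part of Hadoop.
public class TxtToSeqFile {

    // Reads a local text file line by line and appends
    // (line number, line text) records to a SequenceFile.
    // Returns the number of records written.
    public static long convert(Configuration conf, String localTxt, String seqOut)
            throws IOException {
        LongWritable key = new LongWritable();
        Text value = new Text();
        long lineNo = 0;
        try (BufferedReader in = Files.newBufferedReader(
                     Paths.get(localTxt), StandardCharsets.UTF_8);
             SequenceFile.Writer writer = SequenceFile.createWriter(conf,
                     SequenceFile.Writer.file(new Path(seqOut)),
                     SequenceFile.Writer.keyClass(LongWritable.class),
                     SequenceFile.Writer.valueClass(Text.class))) {
            String line;
            while ((line = in.readLine()) != null) {
                key.set(++lineNo);   // 1-based line number as the key
                value.set(line);     // whole line as the value
                writer.append(key, value);
            }
        }
        return lineNo;
    }

    public static void main(String[] args) throws IOException {
        // args[0]: local .txt path
        // args[1]: output path, resolved against fs.defaultFS in the configuration
        long n = convert(new Configuration(), args[0], args[1]);
        System.out.println("Wrote " + n + " records");
    }
}
```

With a default Configuration and no cluster settings on the classpath, the output path resolves to the local file system; when the client's core-site.xml points fs.defaultFS at HDFS, the same code writes the sequence file to HDFS directly.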
