How to convert a .txt file to Hadoop's sequence file format

独厮守ぢ 2020-11-29 01:19

To make effective use of MapReduce jobs in Hadoop, I need the data to be stored in Hadoop's sequence file format. However, the data is currently only in flat .txt format. Can anyo

7 answers
  • 2020-11-29 02:22

    If your data is not on HDFS, you need to upload it to HDFS. Two options:

    i) Run hdfs dfs -put on your .txt file; once it is on HDFS, you can convert it to a sequence file.

    ii) Read the text file on your HDFS client box and convert it to a sequence file with the SequenceFile API: create a SequenceFile.Writer and append (key, value) pairs to it.

    If you don't care about the key, you can use the line number as the key and the complete line of text as the value.
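
As a rough illustration of the second option, here is a minimal sketch that reads a local text file and appends one record per line, with the line number as the key and the line text as the value. It assumes a Hadoop 2.x or later client library on the classpath and UTF-8 input; the class name TextToSequenceFile and the method convert are made up for this example, not part of Hadoop.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.SequenceFile;
import org.apache.hadoop.io.Text;

// Hypothetical helper class for illustration only.
public class TextToSequenceFile {

    /**
     * Writes every line of a local text file as one (line number, line text)
     * record and returns the number of records written.
     * Line numbers start at 1; the input is assumed to be UTF-8.
     */
    public static long convert(String localTxt, Path seqFile, Configuration conf)
            throws IOException {
        LongWritable key = new LongWritable();
        Text value = new Text();
        long lineNumber = 0;

        try (BufferedReader reader =
                 Files.newBufferedReader(Paths.get(localTxt), StandardCharsets.UTF_8);
             SequenceFile.Writer writer = SequenceFile.createWriter(conf,
                 SequenceFile.Writer.file(seqFile),
                 SequenceFile.Writer.keyClass(LongWritable.class),
                 SequenceFile.Writer.valueClass(Text.class))) {

            String line;
            while ((line = reader.readLine()) != null) {
                key.set(++lineNumber);   // key: 1-based line number
                value.set(line);         // value: the complete line
                writer.append(key, value);
            }
        }
        return lineNumber;
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 2) {
            System.err.println("usage: TextToSequenceFile <local.txt> <output.seq>");
            System.exit(1);
        }
        // Picks up core-site.xml etc. from the classpath if present.
        Configuration conf = new Configuration();
        long written = convert(args[0], new Path(args[1]), conf);
        System.out.println("Wrote " + written + " records to " + args[1]);
    }
}
```

The output path is resolved against the file system configured in the Configuration object, so with the cluster configuration on the classpath the sequence file should land directly on HDFS; without it, the same code writes to the local file system.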
