Hadoop put performance - large file (20gb)

后端 未结 3 1997
佛祖请我去吃肉
佛祖请我去吃肉 2021-02-04 10:03

I\'m using hdfs -put to load a large 20GB file into hdfs. Currently the process runs @ 4mins. I\'m trying to improve the write time of loading data into hdfs. I tried utilizing

3条回答
  •  旧时难觅i
    2021-02-04 10:30

    you may want to use distcp hadoop distcp -Ddfs.block.size=$[256*1024*1024] /path/to/inputdata /path/to/outputdata to perform parallel copy

提交回复
热议问题