Hadoop put performance - large file (20gb)

后端未结

关注

 3  1997

佛祖请我去吃肉 2021-02-04 10:03

I\'m using hdfs -put to load a large 20GB file into hdfs. Currently the process runs @ 4mins. I\'m trying to improve the write time of loading data into hdfs. I tried utilizing

3条回答

旧时难觅i (楼主)

2021-02-04 10:30

you may want to use distcp hadoop distcp -Ddfs.block.size=$[256*1024*1024] /path/to/inputdata /path/to/outputdata to perform parallel copy

0 讨论(0)

查看其它3个回答
发布评论:

提交评论
- 加载中...