How to overwrite/reuse the existing output path for Hadoop jobs again and again

既然无缘 2021-02-12 10:29

I want to overwrite/reuse the existing output directory when I run my Hadoop job daily. The output directory will store the summarized output of each day's job run.

10 answers

醉梦人生 (original poster)

2021-02-12 11:20

Hadoop follows a write-once, read-many philosophy. A job expects to create its output directory itself (write once), so when the directory already exists, the job complains and fails instead of overwriting it. You can delete the old directory before each run with hadoop fs -rmr /path/to/your/output/ (on newer Hadoop releases -rmr is deprecated in favor of hadoop fs -rm -r /path/to/your/output/). To preserve earlier results, it's better to write each run to a dynamically named directory (e.g., based on a timestamp or hash value).
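As a rough illustration of the two options in the answer above, here is a minimal Python sketch. The helper names dated_output_path and delete_command are made up for this example and are not part of Hadoop; the base path is the placeholder from the answer. The sketch only builds the per-day path and the delete command as strings and does not contact a cluster. It uses the -rm -r spelling, which replaces the deprecated -rmr on newer Hadoop releases.

```python
from datetime import date
import posixpath


def dated_output_path(base_dir, run_date):
    """Return a per-day output directory, e.g. <base_dir>/2021-02-12.

    Hypothetical helper: one directory per run date, so earlier
    results are preserved instead of being overwritten.
    """
    return posixpath.join(base_dir, run_date.strftime("%Y-%m-%d"))


def delete_command(output_dir):
    """Build (but do not run) the HDFS command that removes output_dir.

    Hypothetical helper: returns an argument list that could be passed
    to subprocess.run on a machine where the hadoop CLI is installed.
    """
    return ["hadoop", "fs", "-rm", "-r", output_dir]


if __name__ == "__main__":
    # Placeholder base path taken from the answer above.
    out = dated_output_path("/path/to/your/output", date.today())
    print("Output directory for today's run:", out)
    print("Command to clear it before a rerun:", " ".join(delete_command(out)))
```

With a date-based directory, the delete step is only needed when the same day's job is rerun.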
