I want to overwrite/reuse the existing output directory when I run my Hadoop job daily. The output directory will store the summarized output of each day's job run.
If one is loading the input file (e.g., with appended entries) from the local file system into the Hadoop Distributed File System (HDFS) like this:
hdfs dfs -put /mylocalfile /user/cloudera/purchase
Then one can also reuse the existing output directory by adding -f, which overwrites the destination if it already exists. There is no need to delete or re-create the folder:
hdfs dfs -put -f /updated_mylocalfile /user/cloudera/purchase
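For a daily run, the same command could be wrapped in a small shell function. This is only a sketch: the function name refresh_hdfs_file and the HDFS_CMD override are made up for illustration (they are not part of Hadoop), and the paths are the ones from the example above.

```shell
#!/usr/bin/env bash
# Hypothetical helper: upload a local file to HDFS, replacing any existing copy.
# HDFS_CMD is an assumed override (not a Hadoop setting) that allows a dry run,
# e.g. HDFS_CMD=echo prints the arguments instead of calling hdfs.
refresh_hdfs_file() {
    local src="$1"    # local file, e.g. /updated_mylocalfile
    local dest="$2"   # HDFS destination, e.g. /user/cloudera/purchase
    local hdfs_cmd="${HDFS_CMD:-hdfs}"
    # -f overwrites the destination if it already exists
    "$hdfs_cmd" dfs -put -f "$src" "$dest"
}
```

It would then be called once per day, for example from cron, as: refresh_hdfs_file /updated_mylocalfile /user/cloudera/purchase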