I want to overwrite/reuse the existing output directory when I run my Hadoop job daily. Actually the output directory will store summarized output of each day\'s job run results
Jungblut's answer is your direct solution. Since I never trust automated processes to delete stuff (me personally), I'll suggest an alternative:
Instead of trying to overwrite, I suggest you make the output name of your job dynamic, including the time in which it ran.
Something like "/path/to/your/output-2011-10-09-23-04/
". This way you can keep around your old job output in case you ever need to revisit in. In my system, which runs 10+ daily jobs, we structure the output to be: /output/job1/2011/10/09/job1out/part-r-xxxxx
, /output/job1/2011/10/10/job1out/part-r-xxxxx
, etc.