Delete partition folders in HDFS older than N days

孤独总比滥情好 2021-01-28 14:33

I want to delete the partition folders which are older than N days.

The command below returns the folders that are exactly 50 days old. I want the list of all folders wh

2 Answers
  • 2021-01-28 15:16

    It can be done with a bash script:

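    # Current time in seconds since the epoch
    today=$(date +'%s')
    # List the directory and keep only sub-directories (lines starting with "d")
    hdfs dfs -ls /data/publish/DMPD/VMCP/staging/tvmcpr_usr_prof/ | grep "^d" | while read -r line; do
        # Column 6 of the listing is the modification date, column 8 is the path
        dir_date=$(echo "${line}" | awk '{print $6}')
        difference=$(( ( today - $(date -d "${dir_date}" +%s) ) / ( 24*60*60 ) ))
        filePath=$(echo "${line}" | awk '{print $8}')

        # Print only the directories that are more than 50 days old
        if [ "${difference}" -gt 50 ]; then
            echo "${filePath}"
        fi
    done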
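
The same idea can be wrapped in a small function so that N is a parameter and the result can be checked before anything is deleted. This is only a sketch: `list_older_than` is a made-up helper name, it assumes the usual 8-column `hdfs dfs -ls` layout (date in column 6, path in column 8) and GNU `date -d`, and it compares the directory modification date rather than any date encoded in the partition name.

```shell
# list_older_than DAYS [NOW_EPOCH]   (hypothetical helper, not part of HDFS)
# Reads `hdfs dfs -ls` output on stdin and prints the paths of directories
# whose modification date is more than DAYS days before NOW_EPOCH
# (defaults to the current time). Assumes GNU date for `date -d`.
list_older_than() {
    local days now perms repl owner group size mdate mtime path age
    days=$1
    now=${2:-$(date +%s)}
    while read -r perms repl owner group size mdate mtime path; do
        # Keep directory entries only; this also skips the "Found N items" header
        case "$perms" in
            d*) ;;
            *) continue ;;
        esac
        # Age in whole days, based on the date column only
        age=$(( (now - $(date -d "$mdate" +%s)) / 86400 ))
        if [ "$age" -gt "$days" ]; then
            printf '%s\n' "$path"
        fi
    done
}

# Example usage with the path from the question. The `echo` makes this a
# preview; drop it only after checking the list.
# hdfs dfs -ls /data/publish/DMPD/VMCP/staging/tvmcpr_usr_prof/ \
#     | list_older_than 50 \
#     | xargs -r -n 1 echo hdfs dfs -rm -r
```

Passing the current time as an optional second argument is just a convenience for trying the function on a saved listing.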
    
  • 2021-01-28 15:17

    You can try Solr's HdfsFindTool:

    hadoop jar /opt/cloudera/parcels/CDH/lib/solr/contrib/mr/search-mr-job.jar org.apache.solr.hadoop.HdfsFindTool -find /data/publish/DMPD/VMCP/staging/tvmcpr_usr_prof -mtime +50 | xargs hdfs dfs -rm -r -skipTrash
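
Since the one-liner above removes everything it finds and bypasses the trash, it may be safer to preview the deletions first. A minimal sketch, assuming the find command prints one path per line; `delete_listed` and the `DRY_RUN` variable are hypothetical names, not part of Hadoop or Solr.

```shell
# delete_listed   (hypothetical helper)
# Reads one HDFS path per line on stdin. With DRY_RUN=1 (the default) it only
# prints the delete command for each path; with DRY_RUN=0 it runs it.
delete_listed() {
    local path
    while IFS= read -r path; do
        # Skip empty lines
        [ -n "$path" ] || continue
        if [ "${DRY_RUN:-1}" = "1" ]; then
            echo "hdfs dfs -rm -r -skipTrash $path"
        else
            # </dev/null keeps the command from consuming the remaining paths
            hdfs dfs -rm -r -skipTrash "$path" </dev/null
        fi
    done
}

# Example usage: preview first, then rerun with DRY_RUN=0 to really delete.
# hadoop jar /opt/cloudera/parcels/CDH/lib/solr/contrib/mr/search-mr-job.jar \
#     org.apache.solr.hadoop.HdfsFindTool \
#     -find /data/publish/DMPD/VMCP/staging/tvmcpr_usr_prof -mtime +50 \
#     | delete_listed
```

Keeping the dry run as the default means a forgotten variable prints commands instead of deleting data.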
    