Question
I have a table with 9k partitions, of which I would like to delete about 1200 (covering 3 days).
I would like to combine hadoop fs -rm with a regular expression for those 3 days, something like pr_load_time=2017070([1-4])(\d+).
The partitions look like this (I want to match only the first two here):
pr_load_time=20170701000317
pr_load_time=20170704133602
pr_load_time=20170705000317
pr_load_time=20170706133602
Is something like this possible? I was thinking about matching the partitions with awk and using xargs, but that seems like a really slow approach for deleting such a large number of files.
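For reference, the awk/xargs route mentioned above might look like the sketch below. This is a minimal sketch: /apps/hive/warehouse/mytable is a hypothetical path standing in for the table's HDFS location, and the pattern follows the regex above. Batching paths with xargs -n avoids one JVM start per partition, but each batch is still a separate hadoop fs invocation, which is part of why this approach tends to be slow:

hadoop fs -ls /apps/hive/warehouse/mytable \
  | awk '$NF ~ /pr_load_time=2017070[1-4]/ {print $NF}' \
  | xargs -n 100 hadoop fs -rm -r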
Answer 1:
I guess the comment above would solve your problem; however, you could try something like the following in case it helps:
/hdfs path/pr_load_time={20170701000317,20170704133602,20170705000317,..}
or something like this:
/hdfs path/pr_load_time=201707{01000317,04133602,05000317,..}
You can also combine different patterns in a single command:
/hdfs path/pr_load_time=201707{01*,04*,05*,..}
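Putting it together, the deletion could then run as a single command (a sketch based on the globs above; the /hdfs path/ prefix is the placeholder from the answer). Quoting the glob lets HDFS expand the braces instead of the local shell, though either works here, since hadoop fs -rm accepts multiple paths:

hadoop fs -rm -r "/hdfs path/pr_load_time=201707{01,04,05}*"

If trash handling makes the deletion slow, hadoop fs -rm also accepts -skipTrash to bypass the trash directory.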
Source: https://stackoverflow.com/questions/45536017/hadoop-fs-rm-with-regular-expression