Question
The files I'm looking for are of the form cmn-我.flac, where the CJK character varies.
Using the find command, what regexp should I use to find all files with a single CJK character in their name?
Hints: The following regexp finds all files, both with and without CJK characters:
find ./ -regex '.*\..*' # ex: cmn-我.flac
Then:
find ./ -regex "cmn-.*[\x4e00-\x9fa5]*\.flac" # the `-` breaks => fails
find ./ -regex ".*[\x4e00-\x9fa5]*\.flac" # finds with n CJK characters => we get closer!
find ./ -regex ".*[\x4e00-\x9fa5]{1}\.flac" # the `{1}` breaks => fails.
find ./ -regex ".*[\x4e00-\x9fa5]?\.flac" # the `?` breaks => fails.
How can I make this work?
Answer 1:
I think you're on the right track and need to look a bit more closely at the find man page (e.g. -regextype).
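As a minimal sketch of what -regextype changes, assuming GNU find (the option is a GNU extension) and hypothetical ASCII sample files so that the result does not depend on the locale: with posix-extended, {1} is read as a repetition count. This only addresses the quantifier syntax, not the CJK range itself.

```shell
# Scratch directory with hypothetical sample files (names are illustrative only).
workdir=$(mktemp -d)
touch "$workdir/cmn-a.flac" "$workdir/cmn-ab.flac"

# Assumes GNU find: -regextype must appear before -regex to take effect.
# In posix-extended syntax, {1} means "exactly one of the preceding item".
extended=$(cd "$workdir" && find . -regextype posix-extended -regex '\./cmn-[a-z]{1}\.flac')

printf '%s\n' "$extended"

# Remove the scratch directory.
rm -rf "$workdir"
```

Only the file with a single character between cmn- and .flac should be listed; the two-character name is skipped.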
Can't reproduce
find ./ -regex "cmn-.*[\x4e00-\x9fa5]*\.xml"
# find: Invalid range end
find's version
First, be sure to check which version of find you're using; there are some differences between implementations:
find --version
Gives:
find (GNU findutils) 4.4.2
…
Explanation
Looking at the -regextype option, I only see POSIX regular expression types: emacs (default), posix-awk, posix-basic, posix-egrep and posix-extended. None of these supports custom hex range definitions (compare Perl with POSIX).
Answer 2:
There was an error in the regex, outside of the CJK-matching part. The path to match is not cmn-我.flac but rather ./cmn-我.flac, because -regex is matched against the whole path that find prints, not just the file name.
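The whole-path behaviour can be checked with a small sketch. It uses a hypothetical ASCII file name instead of a CJK one so that the outcome does not depend on the locale; the scratch directory and variable names are illustrative only.

```shell
# Scratch directory with one hypothetical sample file.
workdir=$(mktemp -d)
touch "$workdir/cmn-a.flac"

# Without the leading "./" the pattern cannot match the full path "./cmn-a.flac".
unanchored=$(cd "$workdir" && find . -regex 'cmn-.\.flac')

# With the "./" prefix included, the same pattern matches.
anchored=$(cd "$workdir" && find . -regex '\./cmn-.\.flac')

printf 'without prefix: [%s]\n' "$unanchored"
printf 'with prefix:    [%s]\n' "$anchored"

# Remove the scratch directory.
rm -rf "$workdir"
```

The first result should be empty and the second should contain the file's path, which is why the earlier patterns starting with cmn- found nothing.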
The following command fully works, matching ./cmn-*.flac where * is any single character, including CJK:
find ./ -regex "./cmn-.\.flac"
The following fully works, matching ./cmn-*.flac where * is any single CJK character:
<< NOT yet found! Help welcome! >>
Source: https://stackoverflow.com/questions/24826457/what-regex-to-find-files-with-cjk-characters-using-find-command