What regex finds files with CJK characters using the `find` command?

Submitted by 北城以北 on 2019-12-11 06:45:01

Question


The files I'm looking for are of the form cmn-我.flac, where the CJK character varies.

Using the find command, what regex should I use to find all files with a single CJK character in their name?


Hint: the following regex finds all files, with or without CJK characters:

find ./ -regex '.*\..*'  # ex: cmn-我.flac

Then I tried:

find ./ -regex "cmn-.*[\x4e00-\x9fa5]*\.flac"   # the `-` breaks => fails 
find ./ -regex ".*[\x4e00-\x9fa5]*\.flac"       # finds with n CJK characters => we get closer!
find ./ -regex ".*[\x4e00-\x9fa5]{1}\.flac"     # the `{1}` breaks => fails. 
find ./ -regex ".*[\x4e00-\x9fa5]?\.flac"       # the `?` breaks => fails. 

How can I make this work?


Answer 1:


I think you're on the right track and need to look a bit more closely at the find man page (e.g. -regextype).

Can't reproduce

find ./ -regex "cmn-.*[\x4e00-\x9fa5]*\.xml"
# find: Invalid range end

find's version

First, be sure to check which version of find you're using, since there are some differences between implementations:

find --version

This gives:

find (GNU findutils) 4.4.2
…

Explanation

Looking at the -regextype option, I only see Emacs- and POSIX-style regular expression types: emacs (default), posix-awk, posix-basic, posix-egrep and posix-extended.

None of these supports hex code point ranges such as [\x4e00-\x9fa5] (compare Perl with POSIX).
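To illustrate what -regextype does change, here is a minimal sketch, assuming GNU find (other implementations may not have -regextype). With posix-extended, the `{1}` interval from the question is accepted, and the pattern is written to match the whole path rather than just the file name. The function name find_cmn_single is made up for this example, and whether `.` matches one multi-byte CJK character still depends on the locale in use.

```shell
# Hypothetical helper (name invented for this sketch): list cmn-X.flac files
# where X is exactly one character. Assumes GNU find, which has -regextype.
find_cmn_single() {
    # -regex is matched against the whole path (e.g. ./cmn-a.flac),
    # so the pattern starts with .*/ instead of cmn-.
    # -regextype must come before -regex to take effect.
    find "${1:-.}" -regextype posix-extended -regex '.*/cmn-.{1}\.flac'
}
```

Calling `find_cmn_single ./` searches the current directory tree. This only fixes the quantifier and whole-path issues; it does not restrict the character to the CJK range.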




Answer 2:


  1. There was an error in the regex, outside of the CJK matching part: -regex is matched against the whole path, so the form to match is not

    cmn-我.flac

    but rather:

    ./cmn-我.flac

  2. The following command works, matching ./cmn-*.flac where * is any single character, CJK included:

    find ./ -regex "./cmn-.\.flac"

  3. The following works, matching ./cmn-*.flac where * is any single CJK character:

    << NOT yet found ! Help welcome! >>



Source: https://stackoverflow.com/questions/24826457/what-regex-to-find-files-with-cjk-characters-using-find-command
