I wonder the general rule to use regular expression in if clause in bash?
Here is an example
$ gg=svm-grid-ch
$ if [[ $gg == *grid* ]] ; then echo
@OP,
Is glob pettern not only used for file names?
No, "glob" pattern is not only used for file names. you an use it to compare strings as well. In your examples, you can use case/esac to look for strings patterns.
gg=svm-grid-ch
# looking for the word "grid" in the string $gg
case "$gg" in
*grid* ) echo "found";;
esac
# [[ $gg =~ ^....grid* ]]
case "$gg" in ????grid*) echo "found";; esac
# [[ $gg =~ s...grid* ]]
case "$gg" in s???grid*) echo "found";; esac
In bash, when to use glob pattern and when to use regular expression? Thanks!
Regex are more versatile and "convenient" than "glob patterns", however unless you are doing complex tasks that "globbing/extended globbing" cannot provide easily, then there's no need to use regex.
Regex are not supported for version of bash <3.2 (as dennis mentioned), but you can still use extended globbing (by setting extglob
). for extended globbing, see here and some simple examples here.
Update for OP: Example to find files that start with 2 characters (the dots "." means 1 char) followed by "g" using regex
eg output
$ shopt -s dotglob
$ ls -1 *
abg
degree
..g
$ for file in *; do [[ $file =~ "..g" ]] && echo $file ; done
abg
degree
..g
In the above, the files are matched because their names contain 2 characters followed by "g". (ie ..g
).
The equivalent with globbing will be something like this: (look at reference for meaning of ?
and *
)
$ for file in ??g*; do echo $file; done
abg
degree
..g
When using a glob pattern, a question mark represents a single character and an asterisk represents a sequence of zero or more characters:
if [[ $gg == ????grid* ]] ; then echo $gg; fi
When using a regular expression, a dot represents a single character and an asterisk represents zero or more of the preceding character. So ".*
" represents zero or more of any character, "a*
" represents zero or more "a", "[0-9]*
" represents zero or more digits. Another useful one (among many) is the plus sign which represents one or more of the preceding character. So "[a-z]+
" represents one or more lowercase alpha character (in the C locale - and some others).
if [[ $gg =~ ^....grid.*$ ]] ; then echo $gg; fi
if [[ $gg =~ ^....grid.* ]]
Adding this solution with grep
and basic sh
builtins for those interested in a more portable solution (independent of bash
version; also works with plain old sh
, on non-Linux platforms etc.)
# GLOB matching
gg=svm-grid-ch
case "$gg" in
*grid*) echo $gg ;;
esac
# REGEXP
if echo "$gg" | grep '^....grid*' >/dev/null ; then echo $gg ; fi
if echo "$gg" | grep '....grid*' >/dev/null ; then echo $gg ; fi
if echo "$gg" | grep 's...grid*' >/dev/null ; then echo $gg ; fi
# Extended REGEXP
if echo "$gg" | egrep '(^....grid*|....grid*|s...grid*)' >/dev/null ; then
echo $gg
fi
Some grep
incarnations also support the -q
(quiet) option as an alternative to redirecting to /dev/null
, but the redirect is again the most portable.
Use
=~
for regular expression check Regular Expressions Tutorial Table of Contents