I am trying to match on the presence of a word in a list before adding that word again (to avoid duplicates). I am using bash 4.2.24 and am trying the below:
Tangential to your question, but if you can use grep -E (or egrep, its effective but obsolescent alias) in your script:
if grep -q -E "\b${myword}\b" <<<"$foo"; then
I ended up using this after flailing with bash's =~.
Note that while the regex constructs \<, \>, and \b are not POSIX-compliant, both the BSD (macOS) and GNU (Linux) implementations of grep -E support them, which makes this approach widely usable in practice.
Small caveat (not an issue in the case at hand): by not using =~, you lose the ability to inspect capturing subexpressions (capture groups) via ${BASH_REMATCH[@]} later.
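For the original goal (add a word only if it is not already in the list), the grep -E test above might be wrapped up roughly as follows. This is only a sketch: add_unique is a hypothetical helper name, the list is assumed to be space-separated, and the word is assumed to contain no regex metacharacters.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the list, with the word appended only if absent.
# Assumes a space-separated list and a word without regex metacharacters.
add_unique() {
  local list=$1 word=$2
  if grep -q -E "\b${word}\b" <<<"$list"; then
    printf '%s\n' "$list"                  # already present: list unchanged
  else
    printf '%s\n' "${list:+$list }$word"   # absent: append (no leading space if list is empty)
  fi
}

add_unique "foo bar baz" "bar"   # prints: foo bar baz
add_unique "foo bar baz" "qux"   # prints: foo bar baz qux
```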
This worked for me
bar='\<myword\>'
[[ $foo =~ $bar ]]
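One thing staying with =~ buys you is access to capture groups via BASH_REMATCH. A small sketch, assuming a GNU/Linux system (whether \< and \> work inside =~ depends on the platform's regex library) and using a made-up sample value for foo:

```shell
#!/usr/bin/env bash
foo="alpha myword beta"      # hypothetical sample list
bar='\<(my)(word)\>'         # same idea as above, with two capture groups added

if [[ $foo =~ $bar ]]; then
  whole=${BASH_REMATCH[0]}   # the entire match
  first=${BASH_REMATCH[1]}   # first parenthesized group
  second=${BASH_REMATCH[2]}  # second parenthesized group
  echo "$whole $first $second"
fi
```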
I've used the following to match word boundaries on older systems. The key is to wrap $foo with spaces: [^[:alpha:]] has to consume a character, so without the padding it would miss words at the beginning or end of the list.
[[ " $foo " =~ [^[:alpha:]]myword[^[:alpha:]] ]]
Tweak the character class as needed based on the expected contents of myword; otherwise this may not be a good solution.
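To make the padding trick reusable with a variable word, something like the following could work. It is a sketch only: has_word_padded is a hypothetical name, and the word is assumed to be purely alphabetic with no regex metacharacters.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeed if $2 occurs in $1 bounded by non-letters.
# The list is padded with spaces so the first and last items also have
# a non-letter on each side.
has_word_padded() {
  local list=$1 word=$2
  local re="[^[:alpha:]]${word}[^[:alpha:]]"
  [[ " $list " =~ $re ]]
}

has_word_padded "myword" "myword"          && echo "sole item: match"
has_word_padded "foo myword" "myword"      && echo "last item: match"
has_word_padded "foo mywordy bar" "myword" || echo "substring: no match"
```

Note that with [^[:alpha:]] a digit counts as a boundary, so "myword2" would still be treated as containing myword; that is the kind of case the character class may need adjusting for.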
Yes, all the listed regex extensions are supported but you'll have better luck putting the pattern in a variable before using it. Try this:
re=\\bmyword\\b
[[ $foo =~ $re ]]
Digging around, I found this question, whose answers seem to explain why the behaviour changes when the regex is written inline as in your example.
Editor's note: The linked question does not explain the OP's problem; it merely explains how starting with Bash version 3.2 regexes (or at least the special regex chars.) must by default be unquoted to be treated as such - which is exactly what the OP attempted.
However, the workarounds in this answer are effective.
You'll probably have to rewrite your tests so as to use a temporary variable for your regexes, or use the 3.1 compatibility mode:
shopt -s compat31
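The quoting rule mentioned in the editor's note can be seen in a small experiment. This sketch assumes bash 3.2 or later with default settings (compat31 not enabled); the sample string and pattern are made up for illustration.

```shell
#!/usr/bin/env bash
# Assumes default settings (compat31 NOT enabled), bash 3.2 or later.
s="abc"
re='a.c'

# Quoted pattern: every character is literal, so "." must match a real dot.
if [[ $s =~ "a.c" ]]; then quoted=match; else quoted=nomatch; fi

# Pattern held in an unquoted variable: "." acts as a regex wildcard.
if [[ $s =~ $re ]]; then unquoted=match; else unquoted=nomatch; fi

echo "quoted=$quoted unquoted=$unquoted"   # prints: quoted=nomatch unquoted=match
```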
You can use grep, which is more portable than bash's regexp, like this:
if echo "$foo" | grep -q '\<myword\>'; then
  echo "MATCH"
else
  echo "NO MATCH"
fi
Not exactly "\b", but for me more readable (and portable) than the other suggestions:
[[ $foo =~ (^| )myword($| ) ]]
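Applied to the original deduplication problem, the anchored pattern above could be used along these lines. This is a sketch under stated assumptions: has_item is a hypothetical helper, items are separated by single spaces, and the word contains no regex metacharacters.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeed if $2 is a whole space-delimited item of $1.
has_item() {
  local list=$1 word=$2
  local re="(^| )${word}($| )"   # kept in a variable so it stays unquoted in [[ ]]
  [[ $list =~ $re ]]
}

foo="alpha myword beta"          # made-up sample list
if ! has_item "$foo" "gamma"; then
  foo="$foo gamma"               # added: not in the list yet
fi
if ! has_item "$foo" "alpha"; then
  foo="$foo alpha"               # skipped: already present
fi
echo "$foo"                      # prints: alpha myword beta gamma
```

Because it uses only ^, $, alternation, and a literal space, this pattern does not rely on the \b or \< extensions.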