Does bash support word boundary regular expressions?

面向向阳花 2020-12-01 05:30

I am trying to match on the presence of a word in a list before adding that word again (to avoid duplicates). I am using bash 4.2.24 and am trying the below:



        
8 Answers
  • 2020-12-01 05:44

    Tangential to your question, but if you can use grep -E (or egrep, its effective, but obsolescent alias) in your script:

    if grep -q -E "\b${myword}\b" <<<"$foo"; then
    

    I ended up using this after flailing with bash's =~.

    Note that while regex constructs \<, \>, and \b are not POSIX-compliant, both the BSD (macOS) and GNU (Linux) implementations of grep -E support them, which makes this approach widely usable in practice.

    Small caveat (not an issue in the case at hand): By not using =~, you lose the ability to inspect capturing subexpressions (capture groups) via ${BASH_REMATCH[@]} later.
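
    For reference, a minimal sketch of that check wrapped into the append-if-absent logic from the question. The names foo and myword follow the snippet above; the sample values are made up for illustration, and the sketch assumes a grep that understands \b (GNU or BSD grep, as noted above).

```shell
#!/usr/bin/env bash
# Hypothetical sample data; replace with your own list and word.
foo="alpha beta gamma"
myword="beta"

# Append myword only when it is not already present as a whole word.
# Caveat: myword is interpolated into the regex, so any regex
# metacharacters in it would need escaping first.
if grep -q -E "\b${myword}\b" <<<"$foo"; then
    status="already present"
else
    foo="$foo $myword"
    status="added"
fi
echo "$status: $foo"
```

    With the sample values the list is left untouched, because "beta" is already one of its words.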

  • 2020-12-01 05:45

    This worked for me

    bar='\<myword\>'
    [[ $foo =~ $bar ]]
    
  • 2020-12-01 05:50

    I've used the following to match word boundaries on older systems. The key is to wrap $foo with spaces: [^[:alpha:]] has to consume a character, so without the padding it cannot match a word at the very beginning or end of the list.

    [[ " $foo " =~ [^[:alpha:]]myword[^[:alpha:]] ]]
    

    Tweak the character class as needed based on the expected contents of myword; otherwise this may not be a good solution.
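
    A small sketch of why the padding matters, assuming a made-up list in which myword is the first entry. The pattern is the one shown above, kept in a variable (re) only so it can be reused for the three checks.

```shell
#!/usr/bin/env bash
# Hypothetical sample list; myword sits at the very start of it.
foo="myword other words"

re='[^[:alpha:]]myword[^[:alpha:]]'

# Without padding, the leading [^[:alpha:]] has no character to consume.
if [[ $foo =~ $re ]]; then unpadded="match"; else unpadded="no match"; fi

# With padding, the added spaces act as the boundary characters.
if [[ " $foo " =~ $re ]]; then padded="match"; else padded="no match"; fi

# A longer word that merely contains myword is still rejected.
if [[ " mywords " =~ $re ]]; then partial="match"; else partial="no match"; fi

echo "unpadded=$unpadded padded=$padded partial=$partial"
# prints: unpadded=no match padded=match partial=no match
```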

  • 2020-12-01 05:54

    Yes, all the listed regex extensions are supported, but you'll have better luck putting the pattern in a variable before using it. Try this:

    re=\\bmyword\\b
    [[ $foo =~ $re ]]
    

    Digging around, I found this question, whose answers seem to explain why the behaviour changes when the regex is written inline as in your example.

    Editor's note: The linked question does not explain the OP's problem; it merely explains that, starting with Bash version 3.2, regexes (or at least the special regex characters) must by default be unquoted to be treated as such, which is exactly what the OP attempted.
    However, the workarounds in this answer are effective.

    You'll probably have to rewrite your tests so as to use a temporary variable for your regexes, or use the 3.1 compatibility mode:

    shopt -s compat31
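
    The quoting behaviour mentioned in the editor's note can be seen in a short sketch. It assumes bash 3.2 or later without compat31 set, and it deliberately uses a plain POSIX ERE pattern as a stand-in for \\b, since support for \b depends on the regex library bash is built against. The sample list is hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical sample list.
foo="one myword two"

# Portable stand-in for a word-boundary pattern (POSIX ERE only).
re='(^|[[:space:]])myword($|[[:space:]])'

# Unquoted expansion: the variable's contents are used as a regex.
if [[ $foo =~ $re ]]; then unquoted="match"; else unquoted="no match"; fi

# Quoted expansion: the contents are matched as a literal string instead.
if [[ $foo =~ "$re" ]]; then quoted="match"; else quoted="no match"; fi

echo "unquoted=$unquoted quoted=$quoted"
# prints: unquoted=match quoted=no match
```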
    
  • 2020-12-01 05:54

    You can use grep, which is more portable than bash's regex support, like this:

    # Quote $foo so the list is passed through without word splitting or globbing.
    if echo "$foo" | grep -q '\<myword\>'; then
        echo "MATCH"
    else
        echo "NO MATCH"
    fi
    
  • 2020-12-01 06:00

    Not exactly "\b", but for me more readable (and portable) than the other suggestions:

    [[  $foo =~ (^| )myword($| ) ]]
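
    Applied to the original goal of avoiding duplicates, the same pattern could be wrapped in a helper along these lines. add_unique is a hypothetical name, and the sketch assumes a space-separated list and a word that contains no regex metacharacters.

```shell
#!/usr/bin/env bash
# add_unique LIST WORD: print LIST with WORD appended unless already present.
# Hypothetical helper; assumes a space-separated LIST and a WORD without
# regex metacharacters.
add_unique() {
    local list=$1 word=$2
    local re="(^| )${word}($| )"
    if [[ $list =~ $re ]]; then
        printf '%s\n' "$list"
    else
        # Only insert the separating space when the list is non-empty.
        printf '%s\n' "${list:+$list }$word"
    fi
}

foo="alpha beta"
foo=$(add_unique "$foo" "beta")   # unchanged: beta is already listed
foo=$(add_unique "$foo" "bet")    # appended: "bet" is only a substring of "beta"
echo "$foo"
# prints: alpha beta bet
```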
    