Grep and regex - why am I escaping curly braces?

遥遥无期 2020-12-03 14:41

I'm deeply puzzled by the way grep seems to parse a regex:

$ echo "@NS500287" | grep '^@NS500[0-9]{3}'
#nothing
$ echo "@NS500287" | grep '^@NS500[0-


        
3 Answers
  • 2020-12-03 15:07

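    Use egrep instead, which is equivalent to grep -E (recent GNU grep releases mark egrep as obsolescent in favour of grep -E):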

    echo '@NS500287' | egrep '^@NS500[0-9]{3}'
    #                  ^
    #                 /
    #       notice ---
    
  • 2020-12-03 15:08

    The answer relates to the difference between Basic Regular Expressions (BREs) and Extended ones (EREs).

    • In BRE mode (i.e. when you call grep with no option that says otherwise), { and } are interpreted as literal characters. Escaping them with \ turns them into a repetition count for the preceding item, so [0-9]\{3\} means exactly three digits.

    • If you use grep -E instead (ERE mode), you can write { and } without escaping to give the count. In ERE mode, escaping the braces causes them to be interpreted literally instead.
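To make the two modes concrete, here is a small sketch that runs the pattern from the question through all four combinations. The variable names are only for illustration, and the counts assume a grep that follows the BRE/ERE behaviour described above (GNU grep does).

```shell
# Sample line taken from the question.
line='@NS500287'

# BRE (plain grep): escaped braces form the repetition count.
bre_escaped=$(printf '%s\n' "$line" | grep -c '^@NS500[0-9]\{3\}' || true)

# BRE: unescaped braces are literal, so this searches for the text "{3}".
bre_plain=$(printf '%s\n' "$line" | grep -c '^@NS500[0-9]{3}' || true)

# ERE (grep -E): unescaped braces form the repetition count.
ere_plain=$(printf '%s\n' "$line" | grep -E -c '^@NS500[0-9]{3}' || true)

# ERE: escaped braces are literal.
ere_escaped=$(printf '%s\n' "$line" | grep -E -c '^@NS500[0-9]\{3\}' || true)

# Print the number of matching lines for each combination.
printf 'BRE escaped: %s\n' "$bre_escaped"
printf 'BRE plain:   %s\n' "$bre_plain"
printf 'ERE plain:   %s\n' "$ere_plain"
printf 'ERE escaped: %s\n' "$ere_escaped"
```

grep -c prints the number of matching lines, so a 1 means the pattern matched the sample and a 0 means it did not. The `|| true` only keeps the script going when grep exits non-zero on no match.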

  • 2020-12-03 15:09

    This is because { and } only act as special characters when written the way the current regex mode expects. Otherwise, they are treated as literal { and }.

    You can either escape like you did:

    $ echo "@NS500287" | grep '^@NS500[0-9]\{3\}'
    @NS500287
    

    or use grep -E:

    $ echo "@NS500287" | grep -E '^@NS500[0-9]{3}'
    @NS500287
    

    Without any escaping, plain grep matches a brace literally:

    $ echo "he{llo" | grep "{"
    he{llo
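If the goal is the opposite one, matching a literal brace, a rough sketch of the options could look like this. The variable names are made up for the example; the bracket-expression form is shown because it reads the same in both modes.

```shell
text='he{llo'

# BRE (plain grep): a bare { is an ordinary character.
bre_hit=$(printf '%s\n' "$text" | grep '{')

# ERE (grep -E): escape the brace to make it literal.
ere_hit=$(printf '%s\n' "$text" | grep -E '\{')

# A bracket expression matches a literal { in either mode.
bracket_hit=$(printf '%s\n' "$text" | grep -E '[{]')

# Each variable holds the matching line, or is empty if nothing matched.
printf '%s\n' "$bre_hit" "$ere_hit" "$bracket_hit"
```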
    

    From man grep:

    -E, --extended-regexp

    Interpret PATTERN as an extended regular expression (ERE, see below). (-E is specified by POSIX.)

    ...

    REGULAR EXPRESSIONS

    A regular expression is a pattern that describes a set of strings. Regular expressions are constructed analogously to arithmetic expressions, by using various operators to combine smaller expressions.

    grep understands three different versions of regular expression syntax: “basic,” “extended” and “perl.” In GNU grep, there is no difference in available functionality between basic and extended syntaxes. In other implementations, basic regular expressions are less powerful. The following description applies to extended regular expressions; differences for basic regular expressions are summarized afterwards. Perl regular expressions give additional functionality, and are documented in pcresyntax(3) and pcrepattern(3), but may not be available on every system.

    ...

    Basic vs Extended Regular Expressions

    In basic regular expressions the meta-characters ?, +, {, |, (, and ) lose their special meaning; instead use the backslashed versions \?, \+, \{, \|, \(, and \).
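The same rule covers grouping as well as counting. As a sketch, the snippet below writes one pattern both ways, using only \( \) and \{ \}. The sample strings and variable names are invented for the illustration.

```shell
sample='abab'

# BRE: grouping and counting both need backslashes.
bre_grouped=$(printf '%s\n' "$sample" | grep -c '^\(ab\)\{2\}$' || true)

# ERE: the same pattern without backslashes.
ere_grouped=$(printf '%s\n' "$sample" | grep -E -c '^(ab){2}$' || true)

# BRE without backslashes: ( ) { } are ordinary characters here,
# so the pattern does not match "abab" ...
bre_plain=$(printf '%s\n' "$sample" | grep -c '^(ab){2}$' || true)

# ... but it does match the literal text "(ab){2}".
bre_plain_literal=$(printf '%s\n' '(ab){2}' | grep -c '^(ab){2}$' || true)

# Print the number of matching lines for each case.
printf '%s %s %s %s\n' "$bre_grouped" "$ere_grouped" "$bre_plain" "$bre_plain_literal"
```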
