Why am I getting spaces in sed 's/[a-z]*/(&)/g'?

甜味超标 2021-01-16 04:28

I want to put parentheses around every word. I used:

sed 's/[a-z]*/(&)/g'

inputfile.txt

hola crayola123456
abc123456


        
3 Answers
  • 2021-01-16 04:55

    Two errors:

    1. * means zero or more matches, so the pattern also matches the empty string; you need at least one match, which is what + means;
    2. sed (OSX version) uses basic regexp syntax by default (so + isn't available); you should activate extended regexp syntax with the -E option.

    Then:

    echo "hola abc1234 foo12 bar" | sed -E 's/[a-z]+/(&)/g'
    

    produces:

    (hola) (abc)1234 (foo)12 (bar)
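
The same command can be checked against the two sample lines from the question. This is only a minimal sketch: the variable names `input` and `result` are illustrative, and `printf` stands in for reading inputfile.txt.

```shell
# Sample lines from the question; printf stands in for inputfile.txt.
input='hola crayola123456
abc123456'

# -E enables extended regular expressions, so + means "one or more".
result=$(printf '%s\n' "$input" | sed -E 's/[a-z]+/(&)/g')
printf '%s\n' "$result"
# prints:
# (hola) (crayola)123456
# (abc)123456
```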
    
  • 2021-01-16 04:56

    Actually, sed implementations are quite inconsistent in how they handle empty matches. From pure regex theory I would say that every match of [a-z]* in the line should emit (&), so the theoretically perfect result would be (hola)() (crayola)()1()2()3()4()5()6, imho: first [a-z]* matches hola, then [a-z]* matches the empty string in front of the next character (the space); since the space itself did not match, it is echoed unchanged, and so on.

    The Plan9 sed for example emits (hola)() (crayola)()1()2()3()4()5()6.

    What the Linux and BSD/Mac sed do here is quite strange. You can see the effect if you compare "hola1" with "hola1a": (hola)1() and (hola)1(a).
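
The comparison can be reproduced with two short pipelines. This is a sketch that assumes GNU sed (the expected lines in the comments are the ones quoted above for Linux and BSD/Mac; other implementations, such as the Plan9 sed, may print something else). The names `out1` and `out2` are illustrative.

```shell
# Same substitution as in the question, run on two nearly identical inputs.
out1=$(printf '%s\n' 'hola1'  | sed 's/[a-z]*/(&)/g')
out2=$(printf '%s\n' 'hola1a' | sed 's/[a-z]*/(&)/g')
printf '%s\n' "$out1" "$out2"
# With GNU sed this prints:
# (hola)1()
# (hola)1(a)
```

Note that no empty match is reported directly after a non-empty one (nothing between "(hola)" and "1"), while the empty match at the end of "hola1" is reported.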

  • 2021-01-16 04:59

    The reason is that you are using a regex that can match an empty string. [a-z]* can match the empty string at any position before a character, since the regex engine "sees" (i.e. checks) these positions too. You need to replace the * quantifier (matching zero or more occurrences) with the + quantifier (matching one or more occurrences).
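
To see the empty matches in isolation, one can feed sed an input that contains no lowercase letters at all. This is a sketch assuming GNU sed; the variable name `empty` is illustrative, and other sed implementations may place the empty matches differently.

```shell
# No character in "123" is in [a-z], so every match is an empty string.
empty=$(printf '%s\n' '123' | sed 's/[a-z]*/(&)/g')
printf '%s\n' "$empty"
# With GNU sed this prints: ()1()2()3()
```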

    Here is an example of how this can be implemented in GNU sed:

    echo "hola crayola123456" | sed 's/[a-z]\+/(&)/g' 
    


    On Mac, as per anubhava's comment, you need to use the -E option and an unescaped +:

    echo "hola crayola123456" | sed -E 's/[a-z]+/(&)/g'
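
If one command that runs unchanged on both GNU and BSD/Mac sed is preferred, a possible alternative (not taken from the answers above) is to spell "one or more" in basic regex syntax as [a-z][a-z]*. A minimal sketch; the variable name `portable` is illustrative.

```shell
# [a-z][a-z]* is one letter followed by zero or more letters,
# i.e. "one or more" without needing \+ or the -E option.
portable=$(printf '%s\n' 'hola crayola123456' | sed 's/[a-z][a-z]*/(&)/g')
printf '%s\n' "$portable"
# prints: (hola) (crayola)123456
```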
    