Grep regular expression not working as expected

暖寄归人 2021-01-06 14:43

I have a simple grep command trying to get only the first column of a CSV file including the comma. It goes like this...

grep -Eo '^[^,]+,' so

1 answer
  •  伪装坚强ぢ
    2021-01-06 15:15

    BSD grep is buggy in general. See the following related posts:

    • Why does this BSD grep result differ from GNU grep?
    • grep strange behaviour with single letter words
    • How to make BSD grep respect start-of-line anchor

    The last link above covers your case: when the -o option is used, BSD grep ignores the ^ anchor for some reason. The issue is also described in a FreeBSD bug report:

    I've noticed some more issues with the same version of grep. I don't know whether they're related, but I'll append them here for now.

    $ printf abc | grep -o '^[a-c]'

    should just print 'a', but instead gives three hits, against each letter of the incoming text.
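
To see whether the grep on a given machine is affected, the command from the bug report can be wrapped in a quick check. This is only a sketch: the variable name `hits` and the messages are made up for illustration, and it assumes that a correct grep yields exactly one match.

```shell
# Count how many matches grep -o reports for an anchored pattern.
# A grep that respects ^ should report exactly one match ("a").
hits=$(printf 'abc\n' | grep -o '^[a-c]' | wc -l | tr -d ' ')

if [ "$hits" -eq 1 ]; then
    echo "anchor respected: 1 match"
else
    echo "anchor ignored: $hits matches"
fi
```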

    As a workaround, it may be simpler to install GNU grep, which works as expected.

    Alternatively, use sed with a POSIX BRE pattern:

    sed -i '' 's/^\([^,]*,\).*/\1/' file
    

    where the pattern matches

    • ^ - start of a line
    • \([^,]*,\) - Group 1 (later referred to with \1 backreference from the RHS):
      • [^,]* - zero or more chars other than ,
      • , - a , char
    • .* - the rest of the line.

    Note that -i changes the file contents in place. Use -i.bak instead to keep a backup copy of the original file (in that case, drop the empty '' argument that follows -i).
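
To preview the result before touching the file, the same substitution can be run without -i so the output goes to stdout. The sketch below uses a made-up sample file (`sample.csv` and its contents are assumptions, not the asker's data). It also shows one difference from grep -o: plain sed passes lines without a comma through unchanged, while adding -n and the p flag prints only the lines where the substitution happened.

```shell
# Hypothetical sample data, created in a temporary directory.
tmpdir=$(mktemp -d)
printf 'alice,30,London\nbob,25,Paris\nnocomma\n' > "$tmpdir/sample.csv"

# Without -i: the file is left untouched and the result is printed.
# The line "nocomma" does not match, so sed prints it unchanged.
all_lines=$(sed 's/^\([^,]*,\).*/\1/' "$tmpdir/sample.csv")
printf '%s\n' "$all_lines"

# With -n and the p flag: only lines where the substitution succeeded
# are printed, which is closer to what grep -o would output.
only_matches=$(sed -n 's/^\([^,]*,\).*/\1/p' "$tmpdir/sample.csv")
printf '%s\n' "$only_matches"
```

Neither form takes the empty '' argument, since that argument belongs to -i, so both commands should behave the same with BSD and GNU sed.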
