Filter column with awk and regexp

谎友^ 2021-02-01 19:08

I've a pretty simple question. I've a file containing several columns and I want to filter them using awk.

So the column of interest is the 6th column and I want to fi

6 answers
  •  -上瘾入骨i
    2021-02-01 19:45

    This should do the trick:

    awk '$6~/^(([1-9]|[1-9][0-9]|100)[SM]){2}$/' file
    

    Regexplanation:

    ^                        # Match the start of the string
    (([1-9]|[1-9][0-9]|100)  # Match a single digit 1-9 or double digit 10-99 or 100
    [SM]                     # Character class matching the character S or M
    ){2}                     # Repeat everything in the parens twice
    $                        # Match the end of the string
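
A quick way to check the pattern is to run it over a few sample values. The sketch below uses made-up rows (only the sixth column matters) and, as an assumption about portability, builds the expression from a string instead of using {2}, since some older awk implementations do not support interval expressions. The variable names unit, re and matches are purely illustrative.

```shell
# Hypothetical sample rows; the sixth column holds the value under test.
matches=$(printf '%s\n' \
  'a b c d e 10S20M' \
  'a b c d e 100M1S' \
  'a b c d e 0S5M' \
  'a b c d e 101S5M' \
  'a b c d e 5S' \
  'a b c d e 5S5M5S' |
awk 'BEGIN {
       # One "number 1-100 followed by S or M" unit ...
       unit = "([1-9]|[1-9][0-9]|100)[SM]"
       # ... repeated exactly twice and anchored at both ends.
       re = "^" unit unit "$"
     }
     $6 ~ re { print $6 }')

printf '%s\n' "$matches"
```

Only 10S20M and 100M1S are printed: 0 and 101 fall outside 1-100, and the last two rows contain one and three units rather than two.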
    

    You have quite a few issues with your statement:

    awk '{ if($6 == '/[1-100][S|M][1-100][S|M]/') print} file.txt
    
    • == is the string comparison operator. The regex comparison operator is ~.
    • You don't quote regex literals; they are delimited by slashes. You never quote anything with single quotes in awk besides the script itself, and your script is missing its closing single quote.
    • [0-9] is the character class for the digit characters, not a numeric range. It matches any single character in the class 0,1,2,3,4,5,6,7,8,9, not any numeric value inside the range. So [1-100] is not the regular expression for numbers in the range 1 - 100; it matches a single character, either a 1 or a 0.
    • [SM] is equivalent to (S|M). What you tried, [S|M], is the same as (S|\||M). You don't need the OR operator in a character class.
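
The two character-class points can be checked directly. This is a small sketch with made-up one-character inputs; class_hits and pipe_hits are illustrative variable names, not part of the original question.

```shell
# [1-100] is the range 1-1 plus the character 0 (twice): the set {0, 1}.
class_hits=$(printf '%s\n' 0 1 2 7 | awk '/^[1-100]$/')

# Inside brackets | is a literal pipe, so [S|M] matches S, M or |.
pipe_hits=$(printf '%s\n' S M '|' X | awk '/^[S|M]$/')

printf '%s\n' "$class_hits" "$pipe_hits"
```

The first command keeps only 0 and 1, and the second keeps S, M and the pipe character itself.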

    Awk uses the structure condition{action}. If the condition is true, the actions in the following block {} are executed for the current record being read. The condition in my solution is $6~/^(([1-9]|[1-9][0-9]|100)[SM]){2}$/, which can be read as: does the sixth column match the regular expression? If true, the line gets printed, because if you don't give any action awk executes {print $0} by default.
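
To illustrate the default action, the sketch below runs the same test three ways: as a bare condition, with an explicit {print $0}, and inside an if statement in the style of the original attempt. The rows and the simplified pattern /^[1-9]/ are assumptions made for the demo; any condition behaves the same way.

```shell
# Hypothetical two-row input; only the first row has a sixth column
# that starts with a digit 1-9.
rows='a b c d e 10S20M
a b c d e 0S5M'

# Bare condition: awk falls back to the default action, {print $0}.
implicit=$(printf '%s\n' "$rows" | awk '$6 ~ /^[1-9]/')

# The same condition with the action spelled out.
explicit=$(printf '%s\n' "$rows" | awk '$6 ~ /^[1-9]/ { print $0 }')

# The if-statement form, using ~ rather than == and a closed script quote.
if_form=$(printf '%s\n' "$rows" | awk '{ if ($6 ~ /^[1-9]/) print }')

printf '%s\n' "$implicit" "$explicit" "$if_form"
```

All three print the first row only, so the shortest form is usually enough.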
