bash command to print column at specific range of line numbers

独厮守ぢ 2020-12-22 02:20

I'm trying to get the values in column X at lines 5 to 5 + Y. I'm guessing there's a quick way to do this with awk. How is this done?

3 Answers
  • 2020-12-22 02:59

    If by "column" you mean you have a file with, say, comma-delimited fields and you want to extract a particular field, the accepted answer does that nicely. To recap,

    awk -F , 'NR==5 { print $6 }' file
    

    to extract the sixth field from line number 5 in a comma-separated file. If your delimiter is not comma, pass something else as the argument to the -F option. (With GNU Awk you can pass a regex to -F to specify fairly complex column delimiters, but if you need that, go find a more specific question about that particular scenario.)
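
    As a small illustration of -F with a delimiter other than a comma, here is a sketch that runs the same kind of command on made-up semicolon-delimited data. The sample data and the helper name print_field are assumptions for this example only.

```shell
# Made-up sample: six semicolon-delimited lines.
sample() {
    printf '%s\n' 'a1;b1;c1' 'a2;b2;c2' 'a3;b3;c3' 'a4;b4;c4' 'a5;b5;c5' 'a6;b6;c6'
}

# Hypothetical helper: print_field DELIM LINE FIELD, reading from stdin.
print_field() {
    awk -F "$1" -v line="$2" -v field="$3" 'NR == line { print $field }'
}

sample | print_field ';' 5 2   # prints b5
```

    The line and field numbers are passed with -v so the Awk program itself stays a fixed, single-quoted string.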

    If by "column" you mean a fixed character position within a line, the substr function does that.

    awk 'NR == 5 { print substr($0, 6) }' file
    

    prints the sixth column and everything after it. If you want to restrict to a fixed width,

    awk 'NR == 5 { print substr($0, 6, 7) }' file
    

    prints seven characters starting at offset 6 (Awk indexing starts at 1, so offset 1 is the first character on the line) on line 5. If you don't know exactly how many characters to extract, but you want a number, Awk conveniently allows you to extract the number from the start of a string:

    awk 'NR == 5 { print 0 + substr($0, 6, 7) }' file
    

    will extract the same 7 characters but then coerce the result to a number, effectively trimming any non-numeric suffix, and print that.
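
    To see the coercion at work, the following sketch runs both substr variants on a made-up input whose fifth line has digits starting at character 6. The sample text and the function names are assumptions for illustration.

```shell
# Made-up sample: line 5 is "ID   42kg and more", so the digits start at character 6.
sample() {
    printf '%s\n' 'one' 'two' 'three' 'four' 'ID   42kg and more' 'six'
}

# The same substr call, without and with numeric coercion.
raw_slice()     { awk 'NR == 5 { print substr($0, 6, 7) }'; }
numeric_slice() { awk 'NR == 5 { print 0 + substr($0, 6, 7) }'; }

sample | raw_slice       # prints the seven characters "42kg an"
sample | numeric_slice   # prints 42
```

    Note that if the extracted text does not begin with a number at all, adding 0 yields 0 rather than an error.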

    In the most general case, you might want to perform further splitting on the value you have extracted.

    awk 'NR == 5 { split(substr($0, 6), a, /:/); print a[1] }' file
    

    will split the extracted substring on the regex /:/ (in this trivial case, the regex simply matches a literal colon character) into the array a. We then print the first element of a, meaning we ditch everything starting from the first colon in the substring which starts at index 6 and extends through to the end of the line on line number 5.
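
    split also returns the number of pieces it produced, which helps when you want more than the first element. A minimal sketch, assuming a made-up fifth line of the form host=alpha:8080:tcp (the data and the function name pieces are hypothetical):

```shell
# Made-up sample: line 5 is "host=alpha:8080:tcp"; the value starts at character 6.
sample() {
    printf '%s\n' 'l1' 'l2' 'l3' 'l4' 'host=alpha:8080:tcp' 'l6'
}

# Split the substring on colons and print every piece with its array index.
pieces() {
    awk 'NR == 5 {
        n = split(substr($0, 6), a, /:/)   # n is the number of pieces
        for (i = 1; i <= n; i++) print i ": " a[i]
    }'
}

sample | pieces
```

    Here a[1] is alpha, a[2] is 8080, and a[3] is tcp.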

    (To spare you from having to look it up, $0 is the entire current input line. Awk processes a file line by line, running the body of the script on each line in turn. If you need to expose shell variables to Awk, awk -v awkvariable="$shellvariable" does that.)

  • 2020-12-22 03:03

    I think this will work for you, untested:

    awk 'NR >= 5 && NR <= 5 + Y { print $X }' file.txt
    

    Obviously, substitute real values for X and Y.

    EDIT:

    If X and Y are shell variables:

    awk -v column="$X" -v range="$Y" 'NR >= 5 && NR <= 5 + range { print $column }' file.txt
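
    A quick way to try this out is to feed it generated input. In the sketch below, the values of X and Y, the sample data, and the helper name print_range are all example assumptions.

```shell
X=2   # column to print (example value)
Y=3   # how far past line 5 to go (example value)

# Made-up sample: ten lines with three whitespace-separated columns each.
sample() {
    i=1
    while [ "$i" -le 10 ]; do
        printf 'a%d b%d c%d\n' "$i" "$i" "$i"
        i=$((i + 1))
    done
}

# Hypothetical helper wrapping the command above: print_range COLUMN RANGE.
print_range() {
    awk -v column="$1" -v range="$2" 'NR >= 5 && NR <= 5 + range { print $column }'
}

sample | print_range "$X" "$Y"   # prints b5, b6, b7, b8 (one per line)
```

    Because both ends of the comparison are inclusive, lines 5 to 5 + Y cover Y + 1 lines, not Y.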
    
  • 2020-12-22 03:11

    Use awk to print column 2 of lines 5 to 10:

    awk 'NR==5,NR==10 {print $2}' <file                           # white space delim. columns
    awk 'NR==5,NR==10 {print $2}; NR==10 {exit}' <file            # optimized
    awk -F: 'NR==5,NR==10 {print $2}; NR==10 {exit}' </etc/passwd # colon delimited columns
    

    The optimization is that it exits after the last line of the desired range has been printed.
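
    One way to observe the effect of the early exit is to have Awk report how many lines it had read when it finished. This is only a sketch: the temporary file, its contents, and the function names are made up, and it relies on END rules still running after exit.

```shell
# Made-up input: a temporary file with 100000 numbered lines ("row 1", "row 2", ...).
tmp=$(mktemp)
awk 'BEGIN { for (i = 1; i <= 100000; i++) print "row", i }' > "$tmp"

# With the early exit: reading stops at line 10.
with_exit() {
    awk 'NR==5,NR==10 {print $2}; NR==10 {exit}; END {print "lines read:", NR}' "$1"
}

# Without it: the same output, but the whole file is read.
without_exit() {
    awk 'NR==5,NR==10 {print $2}; END {print "lines read:", NR}' "$1"
}

with_exit "$tmp"
without_exit "$tmp"
rm -f "$tmp"
```

    Both calls print 5 through 10; the first then reports 10 lines read, the second 100000.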

    A range pattern was used:

    A range pattern is made of two patterns separated by a comma, in the form ‘begpat, endpat’. It is used to match ranges of consecutive input records.
    https://www.gnu.org/software/gawk/manual/html_node/Ranges.html

    Each of the two patterns can be either a regexp pattern or an expression pattern. The commands above use expression patterns that compare against NR.
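
    For comparison, here is a sketch of the same range written once with expression patterns and once with regexp patterns. The START/STOP markers, the sample data, and the function names are invented for this example.

```shell
# Made-up sample: the rows of interest sit between a START and a STOP line.
sample() {
    printf '%s\n' 'header x' 'START 1' 'row 2' 'row 3' 'STOP 4' 'footer 5'
}

# Range bounded by line numbers (expression patterns).
by_number() { awk 'NR==2,NR==5 { print $2 }'; }

# Range bounded by line contents (regexp patterns).
by_regexp() { awk '/^START/,/^STOP/ { print $2 }'; }

sample | by_number   # prints 1, 2, 3, 4 (one per line)
sample | by_regexp   # prints the same four values
```

    The regexp form is handy when the line numbers are not known in advance but the surrounding lines are recognizable.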

    I assumed white space delimited columns, but provided an example of specifying a different delimiter with the -F option.
