awk to print all columns from the nth to the last with spaces

不知归路 2021-01-23 21:26

I have the following input file:

a 1  o p
b  2 o p p
c     3 o p p  p

In the last line there is a double space between the last two p's.

4 Answers
  • 2021-01-23 22:06

    GNU sed

    Remove the first n fields:

    sed -r 's/([^ ]+ +){2}//' file
    

    GNU awk 4.0+

    awk '{sub("([^"FS"]+"FS"+){2}","")}1' file
    

    GNU awk <4.0

    awk --re-interval '{sub("([^"FS"]+"FS"+){2}","")}1' file
    

    In case the FS version doesn't work (Ed's suggestion):

    awk '{sub(/([^ ]+ +){2}/,"")}1' file
    

    Replace 2 with the number of fields you wish to remove.

    EDIT

    Another way (doesn't require --re-interval):

    awk '{for(i=0;i<2;i++)sub($1"[[:space:]]*","")}1' file
    

    Further edit

    As Ed Morton advised, it is unsafe to use field contents as the regex in sub(), because they may contain regex metacharacters, so here is an alternative (again!):

    awk '{for(i=0;i<2;i++)sub(/[^[:space:]]+[[:space:]]*/,"")}1' file
    

    Output

    o p
    o p p
    o p p  p
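
    The loop version above hard-codes the count 2. As a minimal sketch, the count could be passed in with awk's -v option instead. The function name drop_fields is hypothetical, and the sketch assumes fields are separated by spaces or tabs only:

    ```shell
    # drop_fields is a hypothetical helper name, not part of awk or of this answer.
    # Usage: drop_fields N [file...]   (reads stdin when no file is given)
    drop_fields() {
        n=$1
        shift
        awk -v n="$n" '{
            # Delete one leading field plus the blanks after it, n times.
            for (i = 0; i < n; i++)
                sub(/[^ \t]+[ \t]*/, "")
            print
        }' "$@"
    }

    printf 'a 1  o p\nc     3 o p p  p\n' | drop_fields 2
    ```

    Because the regex is a constant, the data never reaches the regex engine as a pattern, and the spacing after the removed fields is left untouched.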
    
  • 2021-01-23 22:08

    Since you want to preserve spaces, let's just use cut:

    $ cut -d' ' -f2- file
    1 o p
    2 o p p
    3 o p p  p
    

    Or, for example, to start from column 4:

    $ cut -d' ' -f4- file
    p
    p p
    p p  p
    

    This will work as long as the columns you are removing are one-space separated.
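
    To illustrate that condition, here is a small sketch of what happens when it does not hold. cut treats every single space as a delimiter, so a run of two spaces produces an empty field in between:

    ```shell
    # Two spaces after "b": cut sees the fields "b", "", "2", "o", "p", "p".
    printf 'b  2 o p p\n' | cut -d' ' -f2-
    # prints " 2 o p p" (the leading space comes from the empty second field)

    # Field 3 onward is needed here to get what -f2- gives on single-spaced input.
    printf 'b  2 o p p\n' | cut -d' ' -f3-
    # prints "2 o p p"
    ```

    So with cut the field number to start from depends on how many spaces precede it on each line.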


    If the columns you are removing are separated by varying amounts of whitespace, you can use the beautiful solution by Ed Morton in "Print all but the first three columns":

    awk '{sub(/[[:space:]]*([^[:space:]]+[[:space:]]+){1}/,"")}1'
                                                       ^
                                            number of cols to remove
    

    Test

    $ cat a
    a 1 o p
    b    2 o p p
    c  3 o p p  p
    $ awk '{sub(/[[:space:]]*([^[:space:]]+[[:space:]]+){2}/,"")}1' a
    o p
    o p p
    o p p  p
    
  • 2021-01-23 22:12

    In Perl, you can use split with capturing to keep the delimiters:

    perl -ne '@f = split /( +)/; print @f[ 1 * 2 .. $#f ]'
    #                                      ^
    #                                      |
    #                              column number goes
    #                              here (starting from 0)
    
  • 2021-01-23 22:25

    If you want to preserve all spaces after the start of the second column, this will do the trick:

    {
        match($0, ($1 "[ \\t]+"))
        print substr($0, RSTART+RLENGTH)
    }
    

    The call to match locates the first 'token' on the line together with the whitespace that follows it, setting RSTART to where the match begins and RLENGTH to its length. Then you just print everything on the line after that.

    You could generalize it somewhat to ignore the first N tokens this way:

    BEGIN {
        N = 2
    }
    
    {
        r = ""
        for (i=1; i<=N; i++) {
            r = (r $i "[ \\t]+")
        }
        match($0, r)
        print substr($0, RSTART+RLENGTH)
    }
    

    Applying the above script to your example input yields:

    o p
    o p p
    o p p  p
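
    One caveat with building the regex from $1, $2, ... is that a token containing regex metacharacters (for example c++) changes what the pattern means. A possible variant, sketched here with a hypothetical function name skip_fields, keeps the match/substr idea but uses a constant regex and walks forward one token at a time:

    ```shell
    # skip_fields is a hypothetical helper name used only for this sketch.
    # Usage: skip_fields N [file...]   (reads stdin when no file is given)
    skip_fields() {
        n=$1
        shift
        awk -v n="$n" '{
            rest = $0
            # Each pass finds the next token and the blanks after it with a
            # constant regex, so characters such as + or * in the data are harmless.
            for (i = 0; i < n && match(rest, /[^ \t]+[ \t]+/); i++)
                rest = substr(rest, RSTART + RLENGTH)
            print rest
        }' "$@"
    }

    printf 'c++  3 o p p  p\n' | skip_fields 2
    ```

    As in the scripts above, a token is only skipped when whitespace follows it, so a line with N or fewer tokens keeps its last token. Changing the final `+` to `*` would remove it as well.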
    