Remove blank-line-separated sets of rows from a file when a set contains a specific keyword

刺人心 2021-01-17 07:07

I have a file containing lines as given below. I want to delete a whole set of rows from the file if any line in that set contains the keyword SEDS2-TOP. Each set of rows is separated by a blank line.

2 answers
  •  伪装坚强ぢ
    2021-01-17 07:34

    You can do this in awk with three pattern rules plus an END rule. It can be written as follows:

    awk 'NF==0 {              # empty line
        for (i in a)          # for each line in array a
            print i           # output line (index)
        if (i in a)           # if the group had lines
            print ""          # output blank line at end
        delete a              # clear a array
        del=0                 # set delete group flag 0
        next                  # get next record
    }
    /SEDS2-TOP/ {             # SEDS2-TOP matched in record
        del=1                 # set delete group flag 1
        delete a              # delete array a
        next                  # get next record
    }
    del==0 {                  # del group flag is zero
        a[$0]++               # add line as index to array a
    }
    END {                     # END rule - process last group of lines
        if (del==0) {         # if del group flag not set
            for (i in a)      # loop over lines in a
                print i       # output line (index)
            print ""          # with newline after
        }
    }' rowsets
    

    Example Use/Output

    Using your data file as input, you can select-copy the script above, change the filename from rowsets to whatever file holds your row-sets, and then middle-mouse paste it into a terminal opened in the directory containing the file, e.g.

    $ awk 'NF==0 {              # empty line
    >     for (i in a)          # for each line in array a
    >         print i           # output line (index)
    >     if (i in a)           # if the group had lines
    >         print ""          # output blank line at end
    >     delete a              # clear a array
    >     del=0                 # set delete group flag 0
    >     next                  # get next record
    > }
    > /SEDS2-TOP/ {             # SEDS2-TOP matched in record
    >     del=1                 # set delete group flag 1
    >     delete a              # delete array a
    >     next                  # get next record
    > }
    > del==0 {                  # del group flag is zero
    >     a[$0]++               # add line as index to array a
    > }
    > END {                     # END rule - process last group of lines
    >     if (del==0) {         # if del group flag not set
    >         for (i in a)      # loop over lines in a
    >             print i       # output line (index)
    >         print ""          # with newline after
    >     }
    > }' rowsets
    0.00  600.00  1500.00     0.00 1.00000 WATER-BOTTOM
    0.00  600.00  2214.28   785.71 1.00000 SEDS1-BOTTOM
    0.00  600.00  2214.28   785.71 1.00000 SEDS1-TOP
    
    0.00  400.00  2004.28   785.71 1.00000 SEDS1-BOTTOM
    0.00  300.00  2254.28   785.71 1.00000 SEDS1-TOP
    0.00  600.00  1600.00     0.00 1.00000 WATER-BOTTOM
    

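One caveat of using the line itself as the array index, sketched here on hypothetical data rather than the asker's file: identical lines within a group collapse into a single index, and `for (i in a)` visits indexes in an unspecified order. The helper name `count_distinct` is an assumption made for this illustration.

```shell
# count_distinct: hypothetical helper showing what a[$0]++ actually stores.
count_distinct() {
    awk '{ a[$0]++ }                  # the line is the index, the value is a count
         END {
             for (i in a) n++         # count the distinct indexes
             print n " distinct of " NR " lines"
         }'
}

# Hypothetical group containing a repeated line.
printf '%s\n' 'A 1' 'A 1' 'B 2' | count_distinct
```

If a group can contain repeated rows, indexing by a counter instead of by the line keeps every row.
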
    Preserving Row Order

    If you need to preserve the row order, index the array with a counter (ndx below) instead of with the line itself. Each row is then stored under its position within the group, which lets you output the rows in their original order, e.g.

    awk -v ndx=1 '
    NF==0 {                   # empty line
        for (i=1; i<ndx; i++) # for each line in array a
            print a[i]        # output line
        if (ndx > 1)          # if lines exist
            print ""          # output blank line at end
        delete a              # clear a array
        del=0                 # set delete group flag 0
        ndx=1                 # reset array index 1
        next                  # get next record
    }
    /SEDS2-TOP/ {             # SEDS2-TOP matched in record
        del=1                 # set delete group flag 1
        delete a              # delete array a
        ndx=1                 # reset array index 1
        next                  # get next record
    }
    del==0 {                  # del group flag is zero
        a[ndx++]=$0           # add line to array a
    }
    END {                     # END rule - process last group of lines
        if (del==0) {         # if del group flag not set
            for (i=1; i<ndx; i++) # loop over lines in a
                print a[i]    # output line
            print ""          # with newline after
        }
    }' rowsets

    In that case, your output would be:

    0.00  600.00  2214.28   785.71 1.00000 SEDS1-BOTTOM
    0.00  600.00  2214.28   785.71 1.00000 SEDS1-TOP
    0.00  600.00  1500.00     0.00 1.00000 WATER-BOTTOM
    
    0.00  400.00  2004.28   785.71 1.00000 SEDS1-BOTTOM
    0.00  300.00  2254.28   785.71 1.00000 SEDS1-TOP
    0.00  600.00  1600.00     0.00 1.00000 WATER-BOTTOM
    

    Look things over and let me know if you have further questions.
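As an alternative sketch (not the method used above), awk's paragraph mode can treat each blank-line-separated set as a single record, so one pattern drops every set that contains the keyword while keeping the rows in order. The sample data and the helper names `write_sample` and `drop_groups` are hypothetical, chosen only for illustration.

```shell
# Hypothetical sample data (not the asker's file): three sets separated by blank lines.
write_sample() {
    printf '%s\n' \
        '0.00 600.00 SEDS1-TOP' \
        '0.00 600.00 WATER-BOTTOM' \
        '' \
        '0.00 500.00 SEDS2-TOP' \
        '0.00 500.00 SEDS2-BOTTOM' \
        '' \
        '0.00 400.00 SEDS1-BOTTOM'
}

# RS="" switches awk to paragraph mode: each blank-line-separated set is one record.
# The pattern prints only records that do not contain SEDS2-TOP, and
# ORS="\n\n" writes a blank line after every printed set.
drop_groups() {
    awk -v RS= -v ORS='\n\n' '!/SEDS2-TOP/' "$@"
}

write_sample | drop_groups
```

To run it on a file, pass the filename as an argument, e.g. `drop_groups rowsets`. Note that paragraph mode only splits on truly empty lines, whereas the `NF==0` test above also treats whitespace-only lines as separators.
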
