I have a file containing lines as given below. I want to delete a set of rows from the file if any line in that set contains the keyword SEDS2-TOP. Each set of rows is
You can do it in awk using three rules plus the END rule. It can be written as follows:
awk 'NF==0 {                # empty line
    for (i in a)            # for each line in array a
        print i             # output line (index)
    if (i in a)             # if lines exist
        print ""            # output blank line at end
    delete a                # clear array a
    del=0                   # set delete group flag 0
    next                    # get next record
}
/SEDS2-TOP/ {               # SEDS2-TOP matched in record
    del=1                   # set delete group flag 1
    delete a                # delete array a
    next                    # get next record
}
del==0 {                    # del group flag is zero
    a[$0]++                 # add line as index to array a
}
END {                       # END rule - process last group of lines
    if (del==0) {           # if del group flag not set
        for (i in a)        # loop over lines in a
            print i         # output line (index)
        print ""            # with newline after
    }
}' rowsets
Example Use/Output
Using your data file as input, you can simply select-copy the script above (changing the filename containing the row-sets from rowsets to whatever you have), then middle-mouse paste it into your terminal in the directory with the file, e.g.
$ awk 'NF==0 {                # empty line
>     for (i in a)            # for each line in array a
>         print i             # output line (index)
>     if (i in a)             # if lines exist
>         print ""            # output blank line at end
>     delete a                # clear array a
>     del=0                   # set delete group flag 0
>     next                    # get next record
> }
> /SEDS2-TOP/ {               # SEDS2-TOP matched in record
>     del=1                   # set delete group flag 1
>     delete a                # delete array a
>     next                    # get next record
> }
> del==0 {                    # del group flag is zero
>     a[$0]++                 # add line as index to array a
> }
> END {                       # END rule - process last group of lines
>     if (del==0) {           # if del group flag not set
>         for (i in a)        # loop over lines in a
>             print i         # output line (index)
>         print ""            # with newline after
>     }
> }' rowsets
0.00 600.00 1500.00 0.00 1.00000 WATER-BOTTOM
0.00 600.00 2214.28 785.71 1.00000 SEDS1-BOTTOM
0.00 600.00 2214.28 785.71 1.00000 SEDS1-TOP
0.00 400.00 2004.28 785.71 1.00000 SEDS1-BOTTOM
0.00 300.00 2254.28 785.71 1.00000 SEDS1-TOP
0.00 600.00 1600.00 0.00 1.00000 WATER-BOTTOM
Preserving Row Order
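The version above uses each line as the array index, and for (i in a) visits the indices in an unspecified order, so the rows within a set may come out reordered (and identical rows within a set are collapsed into one). If preserving the row order is needed, then instead of using the line as the index, you can introduce a counter variable as the index, corresponding to the row number within the set. That allows you to output the rows in their original order, e.g.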
awk -v ndx=1 '
NF==0 {                         # empty line
    for (i=1; i<ndx; i++)       # for each line in array a, in order
        print a[i]              # output line
    if (ndx > 1)                # if lines exist
        print ""                # output blank line at end
    delete a                    # clear array a
    del=0                       # set delete group flag 0
    ndx=1                       # reset array index 1
    next                        # get next record
}
/SEDS2-TOP/ {                   # SEDS2-TOP matched in record
    del=1                       # set delete group flag 1
    delete a                    # delete array a
    ndx=1                       # reset array index 1
    next                        # get next record
}
del==0 {                        # del group flag is zero
    a[ndx++]=$0                 # add line to array a
}
END {                           # END rule - process last group of lines
    if (del==0) {               # if del group flag not set
        for (i=1; i<ndx; i++)   # loop over lines in a, in order
            print a[i]          # output line
        print ""                # with newline after
    }
}' rowsets
In that case, your output would be:
0.00 600.00 2214.28 785.71 1.00000 SEDS1-BOTTOM
0.00 600.00 2214.28 785.71 1.00000 SEDS1-TOP
0.00 600.00 1500.00 0.00 1.00000 WATER-BOTTOM
0.00 400.00 2004.28 785.71 1.00000 SEDS1-BOTTOM
0.00 300.00 2254.28 785.71 1.00000 SEDS1-TOP
0.00 600.00 1600.00 0.00 1.00000 WATER-BOTTOM
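For comparison, here is a shorter sketch of the same idea using awk's paragraph mode, a different technique from the rule-based scripts above. Setting RS to the empty string makes awk read each blank-line-separated set as a single record, so a whole set can be kept or dropped with one pattern, and the rows stay in their original order. It assumes the sets are separated by truly empty lines (the NF==0 test above also treats whitespace-only lines as separators, which paragraph mode does not). The function name filter_rowsets is a hypothetical wrapper, not something from your setup.

```shell
# Hypothetical helper: print every row set that does NOT contain SEDS2-TOP.
# $1 is the file holding the blank-line-separated row sets.
filter_rowsets() {
    # RS=        -> paragraph mode: one record per blank-line-separated set
    # ORS='\n\n' -> write a blank line after each set that is kept
    awk -v RS= -v ORS='\n\n' '!/SEDS2-TOP/' "$1"
}
```

You would call it as filter_rowsets rowsets. As with the scripts above, the output ends with a blank line after the last set.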
Look things over and let me know if you have further questions.