Consider the following command:
gawk -F"\t" "BEGIN{OFS=\"\t\"}{$2=$3=\"\"; print $0}" Input.tsv
When I set $2 = $3 = "", the intended effect is the same as writing:
print $1,$4,$5...$NF
However, what actually happens is that I get two empty fields, with the extra field delimiters still printing.
Is it possible to actually delete $2 and $3?
Note: On Linux in bash, the command above would be written as follows, but cmd.exe on Windows does not handle single quotes well.
gawk -F'\t' 'BEGIN{OFS="\t"}{$2=$3=""; print $0}' Input.tsv
This is an oldie but goodie.
As Jonathan points out, you can't delete fields in the middle, but you can replace their contents with the contents of other fields. And you can make a reusable function to handle the deletion for you.
$ cat test.awk
function rmcol(col,    i) {
  for (i=col; i<NF; i++) {
    $i=$(i+1)
  }
  NF--
}
{
  rmcol(3)
}
1
$ printf 'one two three four\ntest red green blue\n' | awk -f test.awk
one two four
test red blue
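Applied to the tab-separated input from the question, the same function can be called once per unwanted column, highest index first so that the first removal does not shift the second target. This is only a sketch: the wrapper name drop_2_3 and the range guard are made up for illustration, and it relies on awk rebuilding $0 when NF is decremented, which gawk does but some other awk implementations may not.

```shell
# Hypothetical wrapper: drop columns 2 and 3 from tab-separated input.
drop_2_3() {
  awk '
    # Shift every field after "col" one place left, then drop the last field.
    # "i" is declared as an extra parameter so that it is local to the function.
    function rmcol(col,    i) {
      if (col < 1 || col > NF)
        return                 # nothing to remove on short lines
      for (i = col; i < NF; i++)
        $i = $(i + 1)
      NF--
    }
    BEGIN { FS = OFS = "\t" }
    {
      rmcol(3)                 # remove the higher-numbered column first
      rmcol(2)
      print
    }
  ' "$@"
}

printf 'a\tb\tc\td\te\n' | drop_2_3
```

With the sample line above, the fields a, d and e remain, separated by single tabs.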
You can't delete fields in the middle, but you can delete fields at the end, by decrementing NF.
So you can shift all the later fields down to overwrite $2 and $3, then decrement NF by two, which erases the last two fields:
$ echo 1 2 3 4 5 6 7 | awk '{for(i=2; i<NF-1; ++i) $i=$(i+2); NF-=2; print $0}'
1 4 5 6 7
If you're just looking to remove columns, you can use cut:
cut -f 1,4- file.txt
To emulate cut:
awk -F "\t" '{ for (i=1; i<=NF; i++) if (i != 2 && i != 3) { if (i == NF) printf "%s\n", $i; else printf "%s\t", $i } }' file.txt
Similar:
awk -F "\t" '{ delim = ""; for (i=1; i<=NF; i++) if (i != 2 && i != 3) { printf "%s%s", delim, $i; delim = "\t" } printf "\n" }' file.txt
HTH
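A variant of the two loops above builds each output line in a variable and emits it with a single print, which keeps empty fields intact and avoids printf entirely. It is a sketch rather than a drop-in cut replacement; the function name skip_fields_2_3 is made up for illustration, and tab-separated input is assumed, as in the question.

```shell
# Hypothetical helper: print every tab-separated field except 2 and 3.
skip_fields_2_3() {
  awk '
    BEGIN { FS = OFS = "\t" }
    {
      out = ""
      sep = ""                 # no separator before the first kept field
      for (i = 1; i <= NF; i++) {
        if (i == 2 || i == 3)
          continue             # skip the unwanted columns
        out = out sep $i
        sep = OFS              # every later kept field is preceded by OFS
      }
      print out
    }
  ' "$@"
}

# Sample line with an empty fourth field and a literal percent sign.
printf 'a\tb\tc\t\t50%%\n' | skip_fields_2_3
```

The sample line keeps its empty fourth field, so the result has three fields: a, an empty field, and 50%.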
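One way could be to blank the fields as you do and then collapse the resulting runs of whitespace with gsub. Note that \s is a gawk extension, and that this also turns any whitespace inside the remaining fields into tabs: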
awk 'BEGIN { FS = "\t" } { $2 = $3 = ""; gsub( /\s+/, "\t" ); print }' input-file
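In addition to the answer by Suicidal Steve, I'd like to suggest one more solution, using sed instead of awk.
It looks more complicated than the cut approach Steve suggested, but it was the better solution because sed -i allows editing in place. Note that the pattern assumes comma-separated input and, because .* is greedy, removes the two fields before the last one, so it hits fields 2 and 3 only on four-field lines.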
sed -i 's/\(.*,\).*,.*,\(.*\)/\1\2/' FILENAME
The only way I can think to do it in Awk without using a loop is to use gsub on $0 to combine adjacent FS:
$ echo {1..10} | awk '{$2=$3=""; gsub(FS"+",FS); print}'
1 4 5 6 7 8 9 10
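One caveat worth checking before relying on this: collapsing runs of the separator cannot tell a blanked field from one that was already empty in the input. The sketch below wraps the same gsub approach for comma-separated data (the function name and sample data are made up for illustration) and runs it on a line whose fourth field is genuinely empty.

```shell
# Hypothetical wrapper around the gsub approach, using "," as the separator.
blank_and_squeeze() {
  awk '
    BEGIN { FS = OFS = "," }
    {
      $2 = $3 = ""             # blank the unwanted fields; $0 is rebuilt with OFS
      gsub(FS "+", FS)         # squeeze each run of separators down to one
      print
    }
  '
}

echo '1,2,3,,5' | blank_and_squeeze
```

This prints 1,5 rather than 1,,5: the empty fourth field disappears along with fields 2 and 3, so the shortcut is only safe when the remaining fields are never empty.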
If the goal is just to remove the extra delimiters, you can use tr on Linux. Example:
$ echo "1,2,,,5" | tr -s ','
1,2,5
echo one two three four five six|awk '{
print $0
is3=$3
$3=""
print $0
print is3
}'
one two three four five six
one two  four five six
three
Source: https://stackoverflow.com/questions/10693608/is-there-a-way-to-completely-delete-fields-in-awk-so-that-extra-delimiters-do-n