Print a comma except on the last line in Awk

I have the following script

awk '{printf "%s", $1"-"$2", "}' $a >> positions;

where $a stores the name of the file. I am actually writing multiple column values into one row. However, I would like to print a comma only if I am not on the last line.

I would do it by finding the number of lines before running the script, e.g. with coreutils and bash:

awk -v nlines="$(wc -l < "$a")" '{printf "%s", $1"-"$2} NR != nlines { printf ", " }' "$a" >>positions
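
As a rough sketch of how the line-count approach behaves, the one-liner above can be wrapped in a function and tried on a small file. The function name join_counted and the sample file are made up for illustration. One assumption worth stating: wc -l counts newline characters, so the file must end with a newline for the last record to be recognised as the last.

```shell
# join_counted is a hypothetical wrapper around the one-liner above.
join_counted() {
  # $1 is the input file. wc -l counts newline characters,
  # so this assumes the file ends with a newline.
  awk -v nlines="$(wc -l < "$1")" '
    { printf "%s", $1 "-" $2 }      # every record: print the pair
    NR != nlines { printf ", " }    # all but the last: add the separator
  ' "$1"
}

sample=$(mktemp)
printf '1 2\n3 4\n' > "$sample"
join_counted "$sample"
# prints 1-2, 3-4 (without a trailing newline)
rm -f "$sample"
```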

If your file only has 2 columns, the following coreutils alternative also works. Example data:

paste <(seq 5) <(seq 5 -1 1) | tee testfile

Output:

1	5
2	4
3	3
4	2
5	1

Now, after replacing tabs with newlines, paste easily assembles the data into the desired format:

 <testfile tr '\t' '\n' | paste -sd-,

Output:

1-5,2-4,3-3,4-2,5-1

Single pass approach:

cat "$a" | # look, I can use this in a pipeline! 
  awk 'NR > 1 { printf(", ") } { printf("%s-%s", $1, $2) }'

Note that I've also simplified the string formatting.

Enjoy this one:

awk '{printf t $1"-"$2} {t=", "}' $a >> positions

It looks a bit tricky at first sight, so let me explain. First, let's change printf to print for clarity:

awk '{print t $1"-"$2} {t=", "}' file

and see what it does for a file with this simple content:

1 A
2 B
3 C
4 D

It produces the following:

1-A
, 2-B
, 3-C
, 4-D

The trick is the leading variable t, which is empty at the beginning. It is set by {t=...} only after it has already been printed by {print t ...}, so the first record gets no separator and every later record is preceded by ", ". As awk keeps iterating over the records, we get the desired sequence.
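
With printf in place of print, the same trick puts everything on one line. Below is a minimal sketch that wraps the original one-liner in a shell function and feeds it the four-line sample; the name join_with_t is made up for illustration. Note that this form passes the data itself as the printf format string, so it assumes the input contains no % characters.

```shell
# join_with_t is a hypothetical helper name, not part of the answer above.
join_with_t() {
  # t is empty on the first record and ", " on every later one,
  # so the separator is printed before each record except the first.
  awk '{printf t $1"-"$2} {t=", "}' "$@"
}

printf '1 A\n2 B\n3 C\n4 D\n' | join_with_t
# prints 1-A, 2-B, 3-C, 4-D (without a trailing newline)
```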

You might think that awk's ORS and OFS would be a reasonable way to handle this:

$ awk '{print $1,$2}' OFS="-" ORS=", " input.txt

But this leaves a trailing ORS, because print appends ORS after every record, including the last one. The newline that ends the input is consumed as a record separator (RS) and has no effect on what print writes. You can work around this with a bit of hackery, but the resultant complexity eliminates the elegance of the one-liner.
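
The trailing separator is easy to overlook in a terminal, so here is a small sketch that makes it visible by wrapping the output in brackets. The three-line sample and the helper name with_ors are assumptions for illustration only.

```shell
# with_ors is a hypothetical helper that runs the ORS/OFS one-liner.
with_ors() {
  awk '{print $1,$2}' OFS="-" ORS=", " "$@"
}

# Wrap the result in brackets so the trailing ", " can be seen.
printf '[%s]\n' "$(printf '3 2\n5 4\n1 8\n' | with_ors)"
# prints [3-2, 5-4, 1-8, ]
```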

So here's my take on this. Since you say you're "writing multiple column values", it's possible that mucking with ORS and OFS would cause problems. So we can achieve the desired output entirely with formatting.

$ cat input.txt
3 2
5 4
1 8
$ awk '{printf "%s%d-%d",t,$1,$2; t=", "} END{print ""}' input.txt
3-2, 5-4, 1-8

This is similar to Michael's and rook's single-pass approaches, but it uses a single printf and correctly uses the format string for formatting.

This will likely perform negligibly better than Michael's solution because an assignment should take less CPU than a test, and noticeably better than any of the multi-pass solutions because the file only needs to be read once.

Here's a better way, without resorting to coreutils:

awk 'FNR==NR { c++; next } { ORS = (FNR==c ? "\n" : ", "); print $1, $2 }' OFS="-" file file
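
The two-pass one-liner above may be easier to follow when spread over several lines with comments. This is the same program, only reformatted and wrapped in a function; the name two_pass and the temporary sample file are made up for illustration.

```shell
# two_pass is a hypothetical wrapper; $1 is the input file.
two_pass() {
  awk '
    # First pass: FNR == NR is only true while reading the first file argument.
    FNR == NR { c++; next }
    # Second pass: pick the record terminator, then print the pair.
    { ORS = (FNR == c ? "\n" : ", "); print $1, $2 }
  ' OFS="-" "$1" "$1"   # the file is named twice on purpose
}

sample=$(mktemp)
printf '3 2\n5 4\n1 8\n' > "$sample"
two_pass "$sample"
# prints 3-2, 5-4, 1-8
rm -f "$sample"
```

Because the file is opened twice, this form needs a regular file and will not work on a pipe.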

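Another option is to buffer every record in an array and emit the separators in the END block:

awk '{a[NR]=$1"-"$2} END{for(i=1;i<=NR;i++) printf "%s%s", a[i], (i<NR ? ", " : "\n")}' "$a" > positions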

Source: https://stackoverflow.com/questions/14517930/print-a-comma-except-on-the-last-line-in-awk
