I'm working on a shell script that will be used by others, and may ingest suspect strings. It's based around awk, so as a basic resiliency measure, I want to have awk outp
There are three alternatives:
1. awk -vORS=$'\0'
   but: $'\0' is a construct specific to some shells (bash, zsh). awk -vORS=$'\0' will not work in most older shells. There is the option to write it as:
   awk 'BEGIN { ORS = "\0" } ; { print $0 }'
   but that will not work with most awk versions.
2. Printing (printf) with character \0:
   awk '{printf( "%s\0", $0)}'
3. Printing directly ASCII 0:
   awk '{ printf( "%s%c", $0, 0 )}'
Testing all alternatives with this code:
#!/bin/bash
# Run one awk program under each awk implementation and dump the raw bytes it prints.
test1() {
    a='awk,mawk,original-awk,busybox awk'
    IFS=',' read -ra line <<<"$a"
    for i in "${line[@]}"; do
        printf "%14.12s %40s" "$i" "$1"
        # $i is deliberately unquoted so that "busybox awk" splits into two words
        printf 'a\nb\nc\n' |
            $i "$1" |
            od -cAn
    done
}
#test1 '{print}'
test1 'BEGIN { ORS = "\0" } ; { print $0 }'
test1 '{ printf "%s\0", $0}'
test1 '{ printf( "%s%c", $0, 0 )}'
We get these results:
awk           BEGIN { ORS = "\0" } ; { print $0 }   a \0 b \0 c \0
mawk          BEGIN { ORS = "\0" } ; { print $0 }   a b c
original-awk  BEGIN { ORS = "\0" } ; { print $0 }   a b c
busybox awk   BEGIN { ORS = "\0" } ; { print $0 }   a b c

awk           { printf "%s\0", $0}                  a \0 b \0 c \0
mawk          { printf "%s\0", $0}                  a b c
original-awk  { printf "%s\0", $0}                  a b c
busybox awk   { printf "%s\0", $0}                  a b c

awk           { printf( "%s%c", $0, 0 )}            a \0 b \0 c \0
mawk          { printf( "%s%c", $0, 0 )}            a \0 b \0 c \0
original-awk  { printf( "%s%c", $0, 0 )}            a \0 b \0 c \0
busybox awk   { printf( "%s%c", $0, 0 )}            a b c
As can be seen above, the first two solutions work only in GNU awk.
The most portable is the third solution: '{ printf( "%s%c", $0, 0 )}'.
None of the solutions works correctly in "busybox awk".
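As a rough illustration of how the third form might be used in a script, here is a minimal sketch. The helper names nul_terminate and count_nuls are made up for this example, and xargs -0 is assumed to be available (it is in GNU and BSD xargs, but it is not in older POSIX specifications).

```shell
#!/bin/sh
# Hypothetical helper: print every input line followed by a NUL byte,
# using the %c form that worked in gawk, mawk and original-awk above.
nul_terminate() {
    awk '{ printf("%s%c", $0, 0) }'
}

# Hypothetical helper: count the NUL bytes arriving on stdin.
count_nuls() {
    # tr -cd deletes everything except NUL; tr -d ' ' strips the padding
    # that some wc implementations add around the number.
    tr -cd '\000' | wc -c | tr -d ' '
}

# Two input lines should give two NUL terminators.
printf 'a\nb c\n' | nul_terminate | count_nuls

# Consume the records safely: "b c" stays a single argument.
printf 'a\nb c\n' | nul_terminate | xargs -0 -n1 printf '[%s]\n'
```

With an awk that passes the test above, the first pipeline should print 2 and the second should print [a] and [b c] on separate lines.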
The versions used for these tests were:
awk> GNU Awk 4.0.1
mawk> mawk 1.3.3 Nov 1996, Copyright (C) Michael D. Brennan
original-awk> awk version 20110810
busybox> BusyBox v1.20.2 (Debian 1:1.20.0-7) multi-call binary.
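Since the behaviour differs between implementations and versions, a script could probe the awk it finds before relying on it. This is only a sketch: awk_emits_nul is a hypothetical helper name, and it assumes the awk command is a single word (so "busybox awk" would need a wrapper).

```shell
#!/bin/sh
# Hypothetical helper: succeeds only if the given awk command
# really writes a NUL byte when asked to print character code 0.
awk_emits_nul() {
    # $1: name of the awk command to probe (assumed to be one word)
    n=$(printf 'x\n' |
        "$1" '{ printf("%s%c", $0, 0) }' 2>/dev/null |
        tr -cd '\000' | wc -c | tr -d ' ')
    [ "$n" = "1" ]
}

if awk_emits_nul awk; then
    echo "awk can emit NUL with %c"
else
    echo "awk cannot emit NUL; fall back to piping through tr"
fi
```

If the probe fails, piping newline-terminated output through tr is a possible fallback.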
You can also pipe your awk's output through tr:
awk '{...code...}' infile | tr '\n' '\0' > outfile
Just tested, it works at least on Linux and FreeBSD.
If you cannot use newlines as separators (for example, if output records can contain newlines inside), just use some other character that's guaranteed not to appear inside a record, e.g. the one with code 1:
awk 'BEGIN { ORS="\001" } {...code...}' | tr '\001' '\0'
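A small sketch of the second variant, where each output record contains a newline of its own. The function name emit_records and the two-field input are invented for the example; the only assumption about the data is that byte \001 never appears in it.

```shell
#!/bin/sh
# Hypothetical example: each record is "field1<newline>field2",
# so newline cannot serve as the record separator.
emit_records() {
    # Terminate records with \001 inside awk, then let tr turn
    # that placeholder byte into NUL.
    awk 'BEGIN { ORS = "\001" } { print $1 "\n" $2 }' | tr '\001' '\000'
}

# Two input lines become two NUL-terminated records;
# the newline inside each record is preserved.
printf 'k1 v1\nk2 v2\n' | emit_records | od -c
```

The octal escape "\001" in an awk string is standard, so this does not depend on how a given awk treats "\0".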
I've solved printing ASCII 0 from awk by calling the UNIX printf command, printf "\000", through system():
echo | awk -v s='printf "\000"' '{system(s);}'
Alright, I've got it.
awk '{printf "%s\0", $0}'
Or, using ORS:
awk -vORS=$'\0' //