Storing JQ NULL-delimited output in bash array

廉价感情. Submitted on 2019-12-06 09:43:28

Question


On bash 4.4.12 with jq 1.5, this one-liner

IFS=_ read -r -a a < <(jq -ncj '["a","b","c"][]+"_"') ; printf '%s\n' "${a[@]}"

gives properly delimited output:

a
b
c

for elements a, b and c respectively. BUT if I try the same thing with a null delimiter, like so:

IFS= read -r -a a < <(jq -ncj '["a","b","c"][]+"\u0000"') ; printf '%s\n' "${a[@]}"

then I get only one array element, containing

abc

Why doesn't this work as expected?

Furthermore, if you try

IFS= read -r -d '' -a a < <(jq -ncj '["a","b","c"][]+"\u0000"') ; printf '%s\n' "${a[@]}"

you will be surprised to get an array with only the first "a" element:

a

My goal is to find an approach without iterating over elements with any kind of a loop.

Edit: **readarray -d** is not a solution, since I need the code to run in bash versions prior to 4.4.


Answer 1:


Use readarray, which gained a -d analogous to the same option on read in bash 4.4:

$ readarray -d $'\0' -t a < <(jq -ncj '["a","b","c"][]+"\u0000"')
$ declare -p a
declare -a a=([0]="a" [1]="b" [2]="c")

-d '' works as well: since shell strings are null-terminated, '' is, technically, the string containing only the null character, so bash cannot tell $'\0' and '' apart.
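
As a quick check of that equivalence, the sketch below reads the same NUL-delimited stream with both spellings of the delimiter. It assumes bash 4.4 or later. The emit helper and the array names are hypothetical, and printf stands in for the jq command so the snippet runs without jq installed.

```shell
#!/usr/bin/env bash
# Assumes bash >= 4.4 (readarray -d). printf stands in for:
#   jq -ncj '["a","b","c"][]+"\u0000"'
emit() { printf '%s\0' a b c; }   # hypothetical helper, not part of the original answer

readarray -d $'\0' -t with_ansi_c < <(emit)
readarray -d ''    -t with_empty  < <(emit)

declare -p with_ansi_c with_empty

# $'\0' collapses to the empty string, so both calls pass the same argument.
if [[ $'\0' == '' ]]; then
    echo "delimiters are identical"
fi
```

Both arrays should end up with the same three elements, which is why the shorter -d '' is the more common spelling.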


Without readarray -d support, you can use a while loop with read, which should work in any version of bash:

a=()
while IFS= read -r -d '' item; do   # IFS= keeps leading/trailing whitespace in each element
    a+=("$item")
done < <( jq -ncj '["a","b","c"][]+"\u0000"' )

This is the best you can do unless you know something about the array elements that would let you pick an alternate delimiter that isn't part of any of the elements.
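
If the loop has to be used in several places, it can be hidden behind a small function so the call site stays a one-liner. This is only a sketch: read_nul_items and NUL_ITEMS are hypothetical names, and printf again stands in for the jq command. It avoids readarray -d and namerefs, so it should not need bash 4.4.

```shell
#!/usr/bin/env bash
# Hypothetical helper: read NUL-delimited records from stdin
# into the global array NUL_ITEMS.
read_nul_items() {
    NUL_ITEMS=()
    local item
    # IFS= preserves surrounding whitespace; -r keeps backslashes literal.
    # The || clause keeps a final record that lacks a trailing NUL.
    while IFS= read -r -d '' item || [[ -n $item ]]; do
        NUL_ITEMS+=("$item")
    done
}

# Stand-in for: jq -ncj '[" a ","b\\n","c d"][]+"\u0000"'
read_nul_items < <(printf '%s\0' ' a ' 'b\n' 'c d')
declare -p NUL_ITEMS
```

The loop is still there, but the caller only sees a single redirection, which is about as close as older bash versions get to readarray -d.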




Answer 2:


I'm assuming that you want to switch from _ to a null delimiter in order to make your scripts more reliable. However, the null delimiter is not the safest way to read JSON elements, since a null character is itself allowed inside JSON text (written as \u0000) according to RFC 7159 (page 8). E.g. if ["a","b","c"] were to look like ["a","b\u0000","c"], and you were to append the null char to each of the strings and parse these with a null delimiter, the "b" element would go into two separate bash array slots.
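
The effect can be reproduced without jq. In this sketch, printf emits the byte stream that jq -j would produce for ["a","b\u0000","c"] with a NUL appended to each string; the variable names are illustrative only.

```shell
#!/usr/bin/env bash
# Stand-in for: jq -ncj '["a","b\u0000","c"][]+"\u0000"'
# i.e. the bytes: a NUL  b NUL NUL  c NUL
items=()
while IFS= read -r -d '' item; do
    items+=("$item")
done < <(printf '%s\0' a b '' c)

echo "count: ${#items[@]}"
declare -p items
```

Three JSON strings come back as four array elements: the "b\u0000" string turns into "b" followed by an empty element, and nothing in the stream tells the reader which NUL was data and which was a separator.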

Instead, given that newlines are always escaped within JSON strings when using e.g. jq -c, I suggest relying on the part of the spec that says

"A string begins and ends with quotation marks."

With that in mind we can define:

jsonStripQuotes(){ local t0; while read -r t0; do t0="${t0%\"}"; t0="${t0#\"}"; printf '%s\n' "$t0"; done < <(jq '.');}

And then, e.g.

echo '["a\u0000 b\n","b\nnn","c d"]' | jq '.[]' | jsonStripQuotes

This should safely print each JSON string on its own line (with a real newline appended), while all newlines and nulls inside the strings stay escaped. After that I would do a read with IFS set to newline only:

while IFS=$'\n' read -r elem; do Arr+=("$elem"); done < <(echo '["a\u0000 b\n","b\nnn","c d"]' | jq '.[]' | jsonStripQuotes)

And then if you want to print them with newlines etc. expanded:

printf '%b' "${Arr[*]}"

I believe this is the most reliable way to parse json strings to a bash array.
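
Put together, the approach might look like the sketch below. To keep it runnable without jq, the hypothetical json_lines function prints the three lines that jq '.[]' would emit for the example input; strip_json_quotes is a stdin-only variant of the helper above, and both names are assumptions for illustration.

```shell
#!/usr/bin/env bash
# Hypothetical helper: strip one leading and one trailing double quote per line.
strip_json_quotes() {
    local line
    while IFS= read -r line; do
        line=${line%\"}
        line=${line#\"}
        printf '%s\n' "$line"
    done
}

# Stand-in for: echo '["a\u0000 b\n","b\nnn","c d"]' | jq '.[]'
json_lines() {
    printf '%s\n' '"a\u0000 b\n"' '"b\nnn"' '"c d"'
}

arr=()
while IFS= read -r elem; do
    arr+=("$elem")          # elements keep their JSON escapes, e.g. a literal \n
done < <(json_lines | strip_json_quotes)

declare -p arr
```

Note that printf '%b' only understands shell-style escapes, so JSON escapes such as \" would not be decoded by it; for full decoding, passing the stored value back through jq -r is probably the safer route.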




Answer 3:


Since you are NOT using jq's -r option, the question arises as to whether the problem as posed in the title is perhaps an "XY" problem. If the goal is simply to assign JSON values to a bash array, consider:

$ readarray -t a < <(jq -nc '["a","b","c"][]') ; printf '%s\n' "${a[@]}"
"a"
"b"
"c"

Notice that the bash array values are recognizably JSON values (in this case, JSON strings, complete with the double-quotation marks).

Even more tellingly:

$ readarray -t a < <(jq -nc '["a\\b","\"b\"","c"][]') ; printf '%s\n' "${a[@]}"
"a\\b"
"\"b\""
"c"

Compare the loss of "JSONarity" that happens when using readarray with NUL:

$ readarray -d "" a < <(jq -ncj '["a\\b","\"b\"","c"][]+"\u0000"') ; printf '%s\n' "${a[@]}"
a\b
"b"
c


Source: https://stackoverflow.com/questions/49321015/storing-jq-null-delimited-output-in-bash-array
