How to loop over jq unique array in bash?

Submitted by 心已入冬 on 2021-02-10 17:49:20

Question


I'm trying to loop over the unique author names and commit messages from a GitHub JSON object. However, when the values contain spaces, bash treats each space-separated word as a separate item:

#!/usr/bin/env bash

commits='[
  {
    "author": {
      "email": "email@example.com",
      "name": "Chris",
      "username": "chris"
    },
    "committer": {
      "email": "email@example.com",
      "name": "Chris",
      "username": "chris"
    },
    "message": "commit message 1"
  },
  {
    "author": {
      "email": "email@example.com",
      "name": "John",
      "username": "jdoe"
    },
    "committer": {
      "email": "email@example.com",
      "name": "John",
      "username": "jdoe"
    },
    "message": "commit message 2"
  },
    {
    "author": {
      "email": "email@example.com",
      "name": "John",
      "username": "jdoe"
    },
    "committer": {
      "email": "email@example.com",
      "name": "John",
      "username": "jdoe"
    },
    "message": "commit message 3"
  }
]'

authors=$( jq -rc '[.[].author.name] | unique | @sh' <<<"${commits}" )
echo "authors: $authors"

# this works
for author in $authors
do
  echo "author: $author"
done

echo "------------"

# this does not
messages=$( jq -rc '[.[].message] | unique | @sh' <<<"${commits}" )
echo "messages: $messages"

for message in $messages
do
  echo "message: $message"
done

Which outputs

authors: 'Chris' 'John'
author: 'Chris'
author: 'John'
------------
messages: 'commit message 1' 'commit message 2' 'commit message 3'
message: 'commit
message: message
message: 1'
message: 'commit
message: message
message: 2'
message: 'commit
message: message
message: 3'

While I expect:

authors: 'Chris' 'John'
author: 'Chris'
author: 'John'
------------
messages: 'commit message 1' 'commit message 2' 'commit message 3'
message: 'commit message 1'
message: 'commit message 2'
message: 'commit message 3'

Answer 1:


This works by changing each ' ' to '_' and back (which assumes that no message contains an underscore of its own):

messages=$( jq -rc '[.[].message] | unique | @sh' <<<"${commits}" )
messages="${messages// /_}"
messages=(${messages//"'_'"/"' '"})
echo "messages: ${messages[@]//_/ }"
for message in "${messages[@]//_/ }"
do
  echo " message: $message"
done
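
The round trip through '_' is needed because the quotes that @sh emits are ordinary characters by the time $messages is expanded. An unquoted expansion is split on the characters in IFS and nothing else. A minimal sketch of that behaviour, using a literal copy of the string from the question rather than a live jq call (the variable names unquoted_count and quoted_count are only for illustration):

```bash
#!/usr/bin/env bash

# Literal copy of the string that `@sh` produced in the question.
messages="'commit message 1' 'commit message 2' 'commit message 3'"

# Unquoted: the shell splits on IFS (space, tab, newline) and ignores the quotes.
set -- $messages
unquoted_count=$#

# Quoted: no splitting at all, so the whole string is a single word.
set -- "$messages"
quoted_count=$#

echo "unquoted: $unquoted_count words, quoted: $quoted_count word"
```

Neither form gives three items, which is why the remaining approaches have jq emit one value per delimiter instead of a shell-quoted list.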

Or split the raw jq output on newlines:

IFS=$'\n' messages=( $(jq -rc '.[].message' <<<"${commits}") )
printf   "messages: "; printf "'%s' " "${messages[@]}"; echo
printf   " message: '%s' \n"          "${messages[@]}"

We could also do the same for both arrays:

     IFS=$'\n'
 authors=($(jq -rc '.[].author.name' <<<"${commits}"))
messages=($(jq -rc '.[].message'     <<<"${commits}"))
printf " authors | "; printf "'%s' " "${authors[@]}" ; echo
printf "  author | '%s' \n"          "${authors[@]}"
echo   "---------+---------"
printf "messages | "; printf "'%s' " "${messages[@]}"; echo
printf " message | '%s' \n"          "${messages[@]}"

This produces the following output:

 authors | 'Chris' 'John' 'John' 
  author | 'Chris' 
  author | 'John' 
  author | 'John' 
---------+---------
messages | 'commit message 1' 'commit message 2' 'commit message 3' 
 message | 'commit message 1' 
 message | 'commit message 2' 
 message | 'commit message 3' 
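
Note that these snippets leave IFS set to a newline for the rest of the script, and that they drop the unique filter from the question, which is why John appears twice in the output. A possible variation that avoids both is sketched here, assuming Bash 4+ for mapfile and values that contain no newlines. The function unique_messages is a hypothetical stand-in so the sketch runs without jq. The jq call it replaces is shown in the comment.

```bash
#!/usr/bin/env bash

# Hypothetical stand-in so the sketch runs without jq; it prints what
#   jq -r '[.[].message] | unique | .[]' <<<"$commits"
# prints for the sample data: one unique message per line.
unique_messages() {
  printf '%s\n' 'commit message 1' 'commit message 2' 'commit message 3'
}

# mapfile -t stores one array element per line and strips the newline.
# IFS is never modified and the lines are not subject to glob expansion.
mapfile -t messages < <(unique_messages)

printf " message | '%s'\n" "${messages[@]}"
```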



Answer 2:


Use readarray (available since Bash 4; its -d option requires Bash 4.4+) to map null-delimited output from jq:

#!/usr/bin/env bash

commits='[
  {
    "author": {
      "email": "email@example.com",
      "name": "Chris",
      "username": "chris"
    },
    "committer": {
      "email": "email@example.com",
      "name": "Chris",
      "username": "chris"
    },
    "message": "commit message 1"
  },
  {
    "author": {
      "email": "email@example.com",
      "name": "John",
      "username": "jdoe"
    },
    "committer": {
      "email": "email@example.com",
      "name": "John",
      "username": "jdoe"
    },
    "message": "commit message 2"
  },
    {
    "author": {
      "email": "email@example.com",
      "name": "John",
      "username": "jdoe"
    },
    "committer": {
      "email": "email@example.com",
      "name": "John",
      "username": "jdoe"
    },
    "message": "commit message 3"
  }
]'

readarray -d '' authors < <(jq -j '.[].author.name + "\u0000"' <<<"${commits}")

for author in "${authors[@]}"
do
  echo "author: $author"
done

echo "------------"

readarray -d '' messages < <(jq -j '.[].message + "\u0000"' <<<"${commits}")

for message in "${messages[@]}"
do
  echo "message: $message"
done
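
The NUL byte is a convenient separator because it cannot occur inside a bash string, so values that span several lines stay intact. A small sketch of the bash side only, assuming Bash 4.4+: the function nul_messages is a hypothetical stand-in for the jq call shown in its comment, and the two-line message is invented sample data.

```bash
#!/usr/bin/env bash

# Hypothetical stand-in for:
#   jq -j '.[].message + "\u0000"' <<<"$commits"
# The second value spans two lines to show that newlines survive.
nul_messages() {
  printf '%s\0' 'commit message 1' $'second line one\nsecond line two'
}

# -d '' splits on NUL; -t drops the delimiter from each element.
readarray -d '' -t messages < <(nul_messages)

for i in "${!messages[@]}"; do
  printf '[%d] %s\n' "$i" "${messages[i]}"
done
```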

Alternatively, if you have an older Bash version without readarray or mapfile, you can separate the strings with the ASCII control character ETX (End of Text, code 03) and use read instead, like this:

IFS=$'\03' read -d '' -ra authors < <(jq -j '.[].author.name + "\u0003"' <<<"${commits}")

IFS=$'\03' read -d '' -ra messages < <(jq -j '.[].message + "\u0003"' <<<"${commits}")
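
One detail worth knowing about this form: read reaches the end of the input without ever seeing a NUL delimiter, so it returns a non-zero status even though the array is filled. That matters in scripts that run under set -e. A sketch of how the status could be captured follows. The function etx_messages is a hypothetical stand-in for the jq call in its comment.

```bash
#!/usr/bin/env bash

# Hypothetical stand-in for:
#   jq -j '.[].message + "\u0003"' <<<"$commits"
etx_messages() {
  printf '%s\003' 'commit message 1' 'commit message 2' 'commit message 3'
}

# IFS applies to this read only. The `|| status=$?` records the non-zero
# exit status that read reports at end of input instead of aborting.
status=0
IFS=$'\03' read -r -d '' -a messages < <(etx_messages) || status=$?

echo "read exit status: $status"
printf "message: '%s'\n" "${messages[@]}"
```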

It is also possible to populate both arrays from a single jq call:

# Populates both arrays from a single jq call
{
  IFS=$'\03' read -r -d '' -a authors
  IFS=$'\03' read -r -d '' -a messages
} < <(jq -j '([.[].author.name] | unique | .[] + "\u0003"), "\u0000",  ([.[].message] | unique | .[] + "\u0003")' <<<"${commits}")

Explanation:

  • [.[].author.name] | unique | .[] + "\u0003":
    Output an ETX (03) delimited list of unique author names.

  • "\u0000": insert a null delimiter

  • [.[].message] | unique | .[] + "\u0003":
    Output an ETX (03) delimited list of unique messages.

  • The whole output of jq is fed to a command group with two read commands.
    Each read stops at the null delimiter or at the end of the stream:

{
  IFS=$'\03' read -r -d '' -a authors
  IFS=$'\03' read -r -d '' -a messages
}
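
To see the stream layout without running jq, the same command group can be fed from a stand-in that prints the bytes in the order described above. This is only an illustration: emit_both is a hypothetical helper, and its output is assumed to match what the jq filter produces for the sample data.

```bash
#!/usr/bin/env bash

# Hypothetical stand-in for the single jq call: ETX-terminated unique
# author names, one NUL byte, then ETX-terminated unique messages.
emit_both() {
  printf '%s\003' 'Chris' 'John'
  printf '\0'
  printf '%s\003' 'commit message 1' 'commit message 2' 'commit message 3'
}

# Both reads share the group's standard input. The first stops at the
# NUL byte, the second continues from there to the end of the stream.
{
  IFS=$'\03' read -r -d '' -a authors
  IFS=$'\03' read -r -d '' -a messages
} < <(emit_both)

printf "author: '%s'\n" "${authors[@]}"
printf "message: '%s'\n" "${messages[@]}"
```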


Source: https://stackoverflow.com/questions/60226618/how-to-loop-over-jq-unique-array-in-bash
