How to wait in a bash script for several subprocesses spawned from that script to finish and return exit code !=0 when any of the subprocesses ends with code !=0 ?
I see lots of good examples listed on here, wanted to throw mine in as well.
#!/bin/bash
items="1 2 3 4 5 6"
pids=""
for item in $items; do
    sleep "$item" &
    pids+="$! "
done
for pid in $pids; do
    wait "$pid"
    status=$?   # save it now: the [ ] test below overwrites $?
    if [ "$status" -eq 0 ]; then
        echo "SUCCESS - Job $pid exited with a status of $status"
    else
        echo "FAILED - Job $pid exited with a status of $status"
    fi
done
I use something very similar to start/stop servers/services in parallel and check each exit status. Works great for me. Hope this helps someone out!
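To also make the script itself fail when any job fails, which is what the question asks for, the per-PID loop can be wrapped in a function that remembers a non-zero status. This is only a sketch: wait_all is a hypothetical helper name, and the three commands are placeholders for real jobs.

```bash
#!/usr/bin/env bash
# wait_all is a hypothetical helper, not part of the answer above.
# It waits for every PID given and returns the last non-zero exit
# status it saw, or 0 if every job succeeded.
wait_all() {
    local pid status result=0
    for pid in "$@"; do
        status=0
        wait "$pid" || status=$?
        if [ "$status" -ne 0 ]; then
            echo "FAILED - Job $pid exited with a status of $status" >&2
            result=$status
        fi
    done
    return "$result"
}

# Placeholder jobs: the second one exits with status 3.
pids=()
for cmd in "sleep 1" "exit 3" "true"; do
    bash -c "$cmd" &
    pids+=("$!")
done

rc=0
wait_all "${pids[@]}" || rc=$?
echo "overall status: $rc"
```

In a real script the last line would typically be exit "$rc", so the caller sees the failure.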
I've just been modifying a script to background and parallelise a process.
I did some experimenting (on Solaris with both bash and ksh) and discovered that, when no PID argument is provided, 'wait' prints the exit status of each job that returned non-zero. E.g.
Bash:
$ sleep 20 && exit 2 &
$ sleep 10 && exit 1 &
$ wait
[1]- Exit 2 sleep 20 && exit 2
[2]+ Exit 1 sleep 10 && exit 1
Ksh:
$ sleep 20 && exit 2 &
$ sleep 10 && exit 1 &
$ wait
[1]+ Done(2) sleep 20 && exit 2
[2]+ Done(1) sleep 10 && exit 1
This output is written to stderr, so a simple solution to the OP's example could be:
#!/bin/bash
trap "rm -f /tmp/x.$$" EXIT
for i in `seq 0 9`; do
doCalculations $i &
done
wait 2> /tmp/x.$$
if [ "$(wc -l < /tmp/x.$$)" -gt 0 ] ; then   # read via stdin so wc prints only the count
exit 1
fi
While this:
wait 2> >(wc -l)
will also return a count but without the tmp file. This might also be used this way, for example:
wait 2> >(if [ `wc -l` -gt 0 ] ; then echo "ERROR"; fi)
But this isn't much more useful than the tmp file IMO. I couldn't find a useful way to avoid the tmp file whilst also avoiding running the "wait" in a subshell, which won't work at all.
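As background for why this answer has to read the job messages at all: in bash, wait with no arguments returns 0 even when a job failed, whereas wait with a PID returns that job's exit status. A minimal sketch of the difference (the variable names are only illustrative):

```bash
#!/usr/bin/env bash
# Case 1: wait without arguments. The failure of the job is not reported.
false &
wait
status_all=$?

# Case 2: wait for one specific PID. The job's own status comes back.
false &
pid=$!
status_one=0
wait "$pid" || status_one=$?

echo "wait (no arguments) returned: $status_all"
echo "wait \$pid returned:           $status_one"
```

So when the PIDs are available, waiting on each one individually avoids both the tmp file and the stderr parsing.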
Exactly for this purpose I wrote a bash function called :for.
Note: :for not only preserves and returns the exit code of the failing function, but also terminates all instances still running in parallel, which might not be needed in this case.
#!/usr/bin/env bash
# Wait for pids to terminate. If one pid exits with
# a non zero exit code, send the TERM signal to all
# processes and retain that exit code
#
# usage:
# :wait 123 32
function :wait(){
local pids=("$@")
[ ${#pids[@]} -eq 0 ] && return $?
trap 'kill -INT "${pids[@]}" &>/dev/null || true; trap - INT' INT
trap 'kill -TERM "${pids[@]}" &>/dev/null || true; trap - RETURN TERM' RETURN TERM
for pid in "${pids[@]}"; do
wait "${pid}" || return $?
done
trap - INT RETURN TERM
}
# Run a function in parallel for each argument.
# Stop all instances if one exits with a non zero
# exit code
#
# usage:
# :for func 1 2 3
#
# env:
# FOR_PARALLEL: Max functions running in parallel
function :for(){
local f="${1}" && shift
local i=0
local pids=()
for arg in "$@"; do
( ${f} "${arg}" ) &
pids+=("$!")
if [ ! -z ${FOR_PARALLEL+x} ]; then
(( i=(i+1)%${FOR_PARALLEL} ))
if (( i==0 )) ;then
:wait "${pids[@]}" || return $?
pids=()
fi
fi
done && [ ${#pids[@]} -eq 0 ] || :wait "${pids[@]}" || return $?
}
for.sh:
#!/usr/bin/env bash
set -e
# import :for from gist: https://gist.github.com/Enteee/c8c11d46a95568be4d331ba58a702b62#file-for
# if you don't like curl imports, source the actual file here.
source <(curl -Ls https://gist.githubusercontent.com/Enteee/c8c11d46a95568be4d331ba58a702b62/raw/)
msg="You should see this three times"
:(){
i="${1}" && shift
echo "${msg}"
sleep 1
if [ "$i" == "1" ]; then sleep 1
elif [ "$i" == "2" ]; then false
elif [ "$i" == "3" ]; then
sleep 3
echo "You should never see this"
fi
} && :for : 1 2 3 || exit $?
echo "You should never see this"
$ ./for.sh; echo $?
You should see this three times
You should see this three times
You should see this three times
1
I needed this, but the target process wasn't a child of the current shell, in which case wait $PID doesn't work. I did find the following alternative instead:
while [ -e /proc/$PID ]; do sleep 0.1 ; done
That relies on the presence of procfs, which may not be available (macOS doesn't provide it, for example). So for portability, you could use this instead:
while ps -p $PID >/dev/null ; do sleep 0.1 ; done
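Either loop spins forever if the process never exits, so it can help to wrap the polling in a function with a timeout. The sketch below is one way to do that, under some assumptions: wait_for_pid is a hypothetical name, the 60-second default is arbitrary, and it probes with kill -0 instead of ps -p, which only works when you are allowed to signal the target (same user or root). Swap the ps -p test back in if that does not hold.

```bash
#!/usr/bin/env bash
# wait_for_pid is a hypothetical helper: poll until the process with
# PID $1 is gone, or until $2 seconds (default 60) have passed.
# Returns 0 once the process has disappeared, 1 on timeout.
# Assumption: kill -0 is permitted for the target process.
wait_for_pid() {
    local pid=$1 timeout=${2:-60}
    local deadline=$(( SECONDS + timeout ))
    # kill -0 sends no signal; it only checks that the PID can be signalled.
    while kill -0 "$pid" 2>/dev/null; do
        if (( SECONDS >= deadline )); then
            return 1
        fi
        sleep 0.1
    done
    return 0
}
```

Usage would look like wait_for_pid "$PID" 30 || echo "still running after 30s". Note that polling only tells you the process is gone; unlike wait, it cannot give you the exit status of a process that is not a child of the shell.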
If you have GNU Parallel installed you can do:
# If doCalculations is a function
export -f doCalculations
seq 0 9 | parallel doCalculations {}
GNU Parallel will give you exit code:
0 - All jobs ran without error.
1-253 - Some of the jobs failed. The exit status gives the number of failed jobs
254 - More than 253 jobs failed.
255 - Other error.
Watch the intro videos to learn more: http://pi.dk/1
The following code will wait for completion of all calculations and return exit status 1 if any of doCalculations fails.
#!/bin/bash
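for i in $(seq 0 9); do
    # Each subshell prints the exit status of its own calculation on stdout.
    (doCalculations "$i" >&2 & wait %1; echo $?) &
# -x matches whole lines only, so statuses such as 10 are not mistaken for 0;
# without -q, grep reads until every subshell has finished.
done | grep -vx 0 >/dev/null && exit 1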