How to wait in bash for several subprocesses to finish and return exit code !=0 when any subprocess ends with code !=0?

悲哀的现实 2020-11-22 03:50

How can a bash script wait for several subprocesses that it spawned to finish, and then return a non-zero exit code if any of those subprocesses ended with a non-zero code?

30 Answers
  • 2020-11-22 04:35

    One way to wait for several subprocesses, and to exit as soon as any one of them exits with a non-zero status code, is to use 'wait -n':

    #!/bin/bash
    wait_for_pids()
    {
        for (( i = 1; i <= $#; i++ )); do
            wait -n "$@"
            status=$?
            echo "received status: $status"
            if [ $status -ne 0 ] && [ $status -ne 127 ]; then
                exit 1
            fi
        done
    }
    
    sleep_for_10()
    {
        sleep 10
        exit 10
    }
    
    sleep_for_20()
    {
        sleep 20
    }
    
    sleep_for_10 &
    pid1=$!
    
    sleep_for_20 &
    pid2=$!
    
    wait_for_pids $pid2 $pid1
    

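    wait returns status 127 when the process it is asked about does not exist or is not a child of this shell, which here means the child may already have exited. That is why 127 is not treated as a failure.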
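
    As a hedged sketch of the same idea, the function below calls wait -n without PID arguments, once per job, and stops the remaining jobs as soon as one fails. The name wait_fail_fast and the sample jobs are assumptions made for illustration, not part of the answer above. wait -n needs Bash 4.3 or later.

```shell
#!/usr/bin/env bash
# Sketch: wait for N background jobs of the current shell; on the first
# failure, terminate the remaining jobs and return 1.
# wait_fail_fast is a hypothetical helper name; `wait -n` needs Bash 4.3+.
wait_fail_fast() {
    local n=$1 status i          # n = number of background jobs started
    for (( i = 0; i < n; i++ )); do
        status=0
        wait -n || status=$?     # returns when the next job terminates
        if (( status != 0 )); then
            # Stop whatever is still running, then reap it quietly.
            kill $(jobs -p) 2>/dev/null
            wait 2>/dev/null
            return 1
        fi
    done
    return 0
}

# Usage: the second job fails after about a second, so the slow job is
# killed and the function returns 1 without waiting 30 seconds for it.
sleep 30 &
( sleep 1; exit 3 ) &
wait_fail_fast 2 || echo "a job failed"
```

    Because wait -n without arguments reacts to any background job of the shell, this sketch assumes that no unrelated jobs are running at the same time.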

  • 2020-11-22 04:35

    I almost fell into the trap of using jobs -p to collect PIDs, which does not work if the child has already exited, as shown in the script below. The solution I picked was simply calling wait -n N times, where N is the number of children I have, which I happen to know deterministically.

    #!/usr/bin/env bash
    
    sleeper() {
        echo "Sleeper $1"
        sleep $2
        echo "Exiting $1"
        return $3
    }
    
    start_sleepers() {
        sleeper 1 1 0 &
        sleeper 2 2 $1 &
        sleeper 3 5 0 &
        sleeper 4 6 0 &
        sleep 4
    }
    
    echo "Using jobs"
    start_sleepers 1
    
    pids=( $(jobs -p) )
    
    echo "PIDS: ${pids[*]}"
    
    for pid in "${pids[@]}"; do
        wait "$pid"
        echo "Exit code $?"
    done
    
    echo "Clearing other children"
    wait -n; echo "Exit code $?"
    wait -n; echo "Exit code $?"
    
    echo "Waiting for N processes"
    start_sleepers 2
    
    for ignored in $(seq 1 4); do
        wait -n
        echo "Exit code $?"
    done
    

    Output:

    Using jobs
    Sleeper 1
    Sleeper 2
    Sleeper 3
    Sleeper 4
    Exiting 1
    Exiting 2
    PIDS: 56496 56497
    Exiting 3
    Exit code 0
    Exiting 4
    Exit code 0
    Clearing other children
    Exit code 0
    Exit code 1
    Waiting for N processes
    Sleeper 1
    Sleeper 2
    Sleeper 3
    Sleeper 4
    Exiting 1
    Exiting 2
    Exit code 0
    Exit code 2
    Exiting 3
    Exit code 0
    Exiting 4
    Exit code 0
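
    Another way around the jobs -p trap, sketched here under the assumption that the script can record PIDs itself, is to save $! right after each launch. The shell remembers the exit status of a background child, so wait with a saved PID still reports it after the child has exited. The subshell commands are placeholders.

```shell
#!/usr/bin/env bash
# Sketch: record each PID from $! at launch instead of asking `jobs -p` later.
pids=()
( sleep 0.1; exit 0 ) & pids+=("$!")
( sleep 0.1; exit 7 ) & pids+=("$!")

sleep 1    # both children have exited by now

statuses=()
for pid in "${pids[@]}"; do
    rc=0
    wait "$pid" || rc=$?     # status is still available after the child exited
    statuses+=("$rc")
done
echo "statuses: ${statuses[*]}"   # prints: statuses: 0 7
```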
    
  • 2020-11-22 04:36

    Here is a simple example using wait.

    Run some processes:

    $ sleep 10 &
    $ sleep 10 &
    $ sleep 20 &
    $ sleep 20 &
    

    Then wait for them with the wait command:

    $ wait $(jobs -p)
    

    Or just call wait without arguments to wait for all of them.

    Either form blocks until all background jobs have completed.

    If the -n option is supplied, wait waits for the next job to terminate and returns its exit status.

    See help wait and help jobs for the syntax.

    The downside is that wait with several IDs returns only the status of the last ID (and wait without arguments returns 0), so you need to wait for each subprocess individually and store its status in a variable.

    Or have your calculation function create a file on failure (empty or containing a failure log), then check whether that file exists, e.g.

    $ sleep 20 && true || touch fail &
    $ sleep 20 && false || touch fail &
    $ wait $(jobs -p)
    $ test -f fail && echo Calculation failed.
    
  • 2020-11-22 04:36

    If you have bash 4.2 or later available, the following might be useful to you. It uses associative arrays to store task names and their "code" as well as task names and their pids. I have also built in a simple rate-limiting method, which might come in handy if your tasks consume a lot of CPU or I/O time and you want to limit the number of concurrent tasks.

    The script launches all tasks in the first loop and consumes the results in the second one.

    This is a bit overkill for simple cases but it allows for pretty neat stuff. For example one can store error messages for each task in another associative array and print them after everything has settled down.

    #! /bin/bash
    
    main () {
        local -A pids=()
        local -A tasks=([task1]="echo 1"
                        [task2]="echo 2"
                        [task3]="echo 3"
                        [task4]="false"
                        [task5]="echo 5"
                        [task6]="false")
        local max_concurrent_tasks=2
    
        for key in "${!tasks[@]}"; do
            while [ $(jobs 2>&1 | grep -c Running) -ge "$max_concurrent_tasks" ]; do
                sleep 1 # gnu sleep allows floating point here...
            done
            ${tasks[$key]} &
            pids+=(["$key"]="$!")
        done
    
        errors=0
        for key in "${!tasks[@]}"; do
            pid=${pids[$key]}
            local cur_ret=0
            if [ -z "$pid" ]; then
                echo "No Job ID known for the $key process" # should never happen
                cur_ret=1
            else
                wait $pid
                cur_ret=$?
            fi
            if [ "$cur_ret" -ne 0 ]; then
                errors=$(($errors + 1))
                echo "$key (${tasks[$key]}) failed."
            fi
        done
    
        return $errors
    }
    
    main
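
    The rate limiting above polls with sleep 1. As a hedged alternative sketch, wait -n (Bash 4.3 or later) can block until a slot frees up instead. The function name run_limited and the idea of passing each task as a command string are assumptions made for this illustration, and the sketch only counts failures rather than tracking which task failed.

```shell
#!/usr/bin/env bash
# Sketch: cap concurrency with `wait -n` (Bash 4.3+) instead of polling.
# run_limited is a hypothetical helper: run_limited MAX "cmd string"...
run_limited() {
    local max=$1; shift
    local running=0 failures=0 cmd
    for cmd in "$@"; do
        if (( running >= max )); then
            # Block until one running task ends; count it if it failed.
            wait -n || failures=$(( failures + 1 ))
            running=$(( running - 1 ))
        fi
        bash -c "$cmd" &
        running=$(( running + 1 ))
    done
    # Drain the tasks that are still running.
    while (( running > 0 )); do
        wait -n || failures=$(( failures + 1 ))
        running=$(( running - 1 ))
    done
    return "$failures"           # number of failed tasks
}

# Usage: at most two tasks run at once; one of the three fails.
run_limited 2 "sleep 0.2" "sleep 0.1; exit 3" "sleep 0.1" || echo "failures: $?"
```

    Like the original, this returns the number of failed tasks, so it shares the limitation that an exit status cannot represent more than 255 failures.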
    
  • 2020-11-22 04:36

    I think that the most straightforward way to run jobs in parallel and check their status is to use temporary files. There are already a couple of similar answers (e.g. Nietzche-jou and mug896).

    #!/bin/bash
    rm -f fail
    for i in `seq 0 9`; do
      doCalculations $i || touch fail &
    done
    wait 
    ! [ -f fail ]
    

    The above code is not safe against concurrent runs. If two instances of the script might run at the same time, it's better to use a more unique file name, like fail.$$. The last line fulfills the requirement "return exit code 1 when any of the subprocesses ends with code !=0". I also threw in an extra requirement of my own, cleaning up the marker file. It may be clearer to write it like this:

    #!/bin/bash
    trap 'rm -f fail.$$' EXIT
    for i in `seq 0 9`; do
      doCalculations $i || touch fail.$$ &
    done
    wait 
    ! [ -f fail.$$ ] 
    

    Here is a similar snippet for gathering results from multiple jobs: I create a temporary directory, store the output of each subtask in a separate file, and then dump them all for review. This doesn't really match the question; I'm throwing it in as a bonus:

    #!/bin/bash
    trap 'rm -fr "$WORK"' EXIT
    
    WORK=/tmp/$$.work
    mkdir -p "$WORK"
    cd "$WORK" || exit 1
    
    for i in `seq 0 9`; do
      doCalculations $i >$i.result &
    done
    wait 
    grep $ *  # display the results with filenames and contents
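
    The marker-file and results-directory ideas can be combined. The following is only a sketch under stated assumptions: do_work is a hypothetical stand-in for doCalculations, and mktemp -d is used to get a private directory instead of a name built from $$. Each job writes its own exit status to a file, so the parent can tell which task failed.

```shell
#!/usr/bin/env bash
# Sketch: each job records its exit status in its own file under a
# private directory created by mktemp.
# do_work is a hypothetical stand-in for the real task.
do_work() { [ "$1" -ne 3 ]; }    # pretend that task 3 fails

work=$(mktemp -d) || exit 1
trap 'rm -rf "$work"' EXIT       # clean up when the script exits

for i in 0 1 2 3 4; do
    ( rc=0; do_work "$i" || rc=$?; echo "$rc" > "$work/$i.status" ) &
done
wait                             # all status files exist after this

failed=0
for f in "$work"/*.status; do
    if [ "$(cat "$f")" -ne 0 ]; then
        echo "task $(basename "$f" .status) failed"
        failed=1
    fi
done
echo "failed=$failed"
# A real script would typically finish with: exit "$failed"
```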
    
  • 2020-11-22 04:37

    wait also (optionally) takes the PID of the process to wait for, and with $! you get the PID of the last command launched in background. Modify the loop to store the PID of each spawned sub-process into an array, and then loop again waiting on each PID.

    # run processes and store pids in array
    # (assumes procs is an array of commands and n_procs lists its indices)
    for i in $n_procs; do
        "${procs[$i]}" &
        pids[$i]=$!
    done
    
    # wait for all pids
    for pid in "${pids[@]}"; do
        wait "$pid"
    done
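
    The loop above waits for every PID but discards the statuses. To also satisfy the "non-zero when any child fails" part of the question, the waiting can be wrapped in a small function. This is a minimal sketch; wait_all is a hypothetical name and the subshells are placeholder jobs.

```shell
#!/usr/bin/env bash
# Sketch: wait on every PID given as an argument and return 1 if any of
# the corresponding processes ended with a non-zero status.
# wait_all is a hypothetical helper name.
wait_all() {
    local pid failed=0
    for pid in "$@"; do
        wait "$pid" || failed=1  # keep waiting for the rest even after a failure
    done
    return "$failed"
}

# Usage: three placeholder jobs, where job n exits with status n.
pids=()
for n in 0 1 2; do
    ( sleep 0.1; exit "$n" ) &
    pids+=("$!")
done
wait_all "${pids[@]}" || echo "at least one job failed"
```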
    