I have a command CMD, called from my main Bourne shell script, that takes forever.
I want to modify the script as follows:
With this method, your script doesn't have to wait for the background process; you only have to monitor a temporary file for the exit status.
FUNCmyCmd() { sleep 3;return 6; };
export retFile=$(mktemp);
FUNCexecAndWait() { FUNCmyCmd;echo $? >$retFile; };
FUNCexecAndWait&
Now your script can do anything else; you just have to keep monitoring the contents of retFile (it can also contain any other information you want, such as the exit time).
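A minimal sketch of the monitoring side, assuming the same temporary-file idea. The names my_cmd and exec_and_record and the one-second polling interval are illustrative, not part of the original answer. The loop relies on the file staying empty until the background job writes its status.

```shell
#!/bin/sh
# Sketch only: poll a status file written by a background job.
retFile=$(mktemp)

my_cmd() { sleep 1; return 6; }                       # stand-in for the slow command
exec_and_record() { my_cmd; echo $? > "$retFile"; }   # record the exit status when done

exec_and_record &

# mktemp creates an empty file; it becomes non-empty once the job has finished.
while [ ! -s "$retFile" ]; do
    # ... do other work here ...
    sleep 1
done

status=$(cat "$retFile")
rm -f "$retFile"
echo "background command exited with status $status"
```

Removing the file after reading it keeps the temporary directory clean.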
P.S. I wrote this with bash in mind.
My solution was to use an anonymous pipe to pass the status to a monitoring loop. There are no temporary files used to exchange status, so there is nothing to clean up. If you were uncertain about the number of background jobs, the break condition could be [ -z "$(jobs -p)" ].
#!/bin/bash
exec 3<> <(:)
{ sleep 15 ; echo "sleep/exit $?" >&3 ; } &
while read -u 3 -t 1 -r STAT CODE || STAT="timeout" ; do
    echo "stat: ${STAT}; code: ${CODE}"
    if [ "${STAT}" = "sleep/exit" ] ; then
        break
    fi
done
I would change your approach slightly. Rather than checking every few seconds if the command is still alive and reporting a message, have another process that reports every few seconds that the command is still running and then kill that process when the command finishes. For example:
#!/bin/sh

cmd() { sleep 5; exit 24; }

cmd &   # Run the long-running process
pid=$!  # Record the pid

# Spawn a process that continually reports that the command is still running
while echo "$(date): $pid is still running"; do sleep 1; done &
echoer=$!

# Set a trap to kill the reporter when the process finishes
trap 'kill $echoer' 0

# Wait for the process to finish
if wait $pid; then
    echo "cmd succeeded"
else
    echo "cmd FAILED!! (returned $?)"
fi
As far as I can see, almost all answers use external utilities (mostly ps) to poll the state of the background process. There is a more Unix-like solution: catching the SIGCHLD signal. The signal handler has to check which child process has terminated. That can be done with the kill -0 <PID> built-in (universal), by checking for the existence of the /proc/<PID> directory (Linux-specific), or with the jobs built-in (bash-specific; jobs -l also reports the PID, and the third field of its output can be Stopped, Running, Done, or Exit).
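As a small illustration of the kill -0 check mentioned above, here is a hedged sketch. The helper name is_running is hypothetical. kill -0 delivers no signal; it only reports whether the PID can currently be signalled, so a recycled PID could in principle give a false positive.

```shell
#!/bin/sh
# Sketch only: test whether a background child is still alive with kill -0.
is_running() { kill -0 "$1" 2>/dev/null; }   # hypothetical helper

sleep 1 &
pid=$!

if is_running "$pid"; then before=running; else before=gone; fi
wait "$pid"          # reap the child so its PID is released
if is_running "$pid"; then after=running; else after=gone; fi

echo "before wait: $before, after wait: $after"
```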
Here is my example.
The launched process is called loop.sh. It accepts -x or a number as an argument. With -x it exits with exit code 1. With a number it waits num*5 seconds, printing its PID every 5 seconds.
The launcher process is called launch.sh:
#!/bin/bash
handle_chld() {
    local tmp=()
    for((i=0;i<${#pids[@]};++i)); do
        if [ ! -d /proc/${pids[i]} ]; then
            wait ${pids[i]}
            echo "Stopped ${pids[i]}; exit code: $?"
        else tmp+=(${pids[i]})
        fi
    done
    pids=(${tmp[@]})
}
set -o monitor
trap "handle_chld" CHLD
# Start background processes
./loop.sh 3 &
pids+=($!)
./loop.sh 2 &
pids+=($!)
./loop.sh -x &
pids+=($!)
# Wait until all background processes are stopped
while [ ${#pids[@]} -gt 0 ]; do echo "WAITING FOR: ${pids[@]}"; sleep 2; done
echo STOPPED
For more explanation see: Starting a process from bash script failed
1: In bash, $! holds the PID of the last background process that was executed. That will tell you what process to monitor, anyway.
4: wait <n> waits until the process with PID <n> is complete (it will block until the process completes, so you might not want to call this until you are sure the process is done), and then returns the exit code of the completed process.
2, 3: ps or ps | grep " $! " can tell you whether the process is still running. It is up to you how to interpret the output and decide how close it is to finishing. (ps | grep isn't idiot-proof. If you have time you can come up with a more robust way to tell whether the process is still running.)
Here's a skeleton script:
# simulate a long process that will have an identifiable exit code
(sleep 15 ; /bin/false) &
my_pid=$!
while ps | grep " $my_pid " # might also need | grep -v grep here
do
    echo $my_pid is still in the ps output. Must still be running.
    sleep 3
done
echo Oh, it looks like the process is done.
wait $my_pid
# The variable $? always holds the exit code of the last command to finish.
# Here it holds the exit code of $my_pid, since wait exits with that code.
my_status=$?
echo The exit status of the process was $my_status
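One possibly more robust variant of the skeleton, offered as a sketch rather than a definitive fix: it tests liveness with kill -0 instead of parsing ps output. It assumes the shell reaps the finished child on its own while the loop runs, which common shells such as bash and dash do. The durations and the exit code 3 are illustrative.

```shell
#!/bin/sh
# Sketch only: same skeleton, but liveness is tested with kill -0 instead of ps | grep.
(sleep 2; exit 3) &     # simulated long process with an identifiable exit code
my_pid=$!

while kill -0 "$my_pid" 2>/dev/null; do
    echo "$my_pid is still running"
    sleep 1
done

# The process is gone; wait returns immediately with its exit code.
wait "$my_pid"
my_status=$?
echo "The exit status of the process was $my_status"
```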
This is how I solved it when I had a similar need:
# Some function that takes a long time to process
longprocess() {
    # Sleep up to 14 seconds
    sleep $((RANDOM % 15))
    # Randomly exit with 0 or 1
    exit $((RANDOM % 2))
}
pids=""
# Run five concurrent processes
for i in {1..5}; do
    ( longprocess ) &
    # store PID of process
    pids+=" $!"
done
# Wait for all processes to finish, will take max 14s
# as it waits in order of launch, not order of finishing
for p in $pids; do
    if wait $p; then
        echo "Process $p success"
    else
        echo "Process $p fail"
    fi
done
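If the caller only needs to know how many jobs failed, the same wait loop can keep a tally. This is a sketch under assumptions: the job function and its fixed exit codes are hypothetical stand-ins for longprocess, chosen so the result is predictable.

```shell
#!/bin/sh
# Sketch only: wait for every background job and count the failures.
job() { sleep 1; exit "$1"; }   # hypothetical job that exits with the given status

pids=""
for code in 0 1 0 2; do
    ( job "$code" ) &
    pids="$pids $!"
done

failed=0
for p in $pids; do
    # wait returns the job's exit status; any non-zero status counts as a failure
    wait "$p" || failed=$((failed + 1))
done

echo "$failed job(s) failed"
```

The script could then finish with a non-zero exit status whenever failed is greater than zero.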