Question
I have a script where I parallelize job execution while monitoring the progress. I do this using xargs and a named FIFO pipe. My problem is that while xargs performs well, some lines written to the pipe are lost. Any idea what the problem is?
For example, the following script (basically my script with dummy data) produces the output below and then hangs at the end, waiting for the missing lines:
$ bash test2.sh
Progress: 0 of 99
DEBUG: Processed data 0 in separate process
Progress: 1 of 99
DEBUG: Processed data 1 in separate process
Progress: 2 of 99
DEBUG: Processed data 2 in separate process
Progress: 3 of 99
DEBUG: Processed data 3 in separate process
Progress: 4 of 99
DEBUG: Processed data 4 in separate process
Progress: 5 of 99
DEBUG: Processed data 5 in separate process
DEBUG: Processed data 6 in separate process
DEBUG: Processed data 7 in separate process
DEBUG: Processed data 8 in separate process
Progress: 6 of 99
DEBUG: Processed data 9 in separate process
Progress: 7 of 99
##### Script is hanging here (Could happen for any line) #####
#!/bin/bash
clear

printStateInLoop() {
    local pipe="$1"
    local total="$2"
    local finished=0
    echo "Progress: $finished of $total"
    while true; do
        if [ $finished -ge $total ]; then
            break
        fi
        let finished++
        read line <"$pipe"
        # In final script I would need to do more than just logging
        echo "Progress: $finished of $total"
    done
}

processData() {
    local number=$1
    local pipe=$2
    sleep 1 # Work needs time
    echo "$number" >"$pipe"
    echo "DEBUG: Processed data $number in separate process"
}
export -f processData

process() {
    TMP_DIR=$(mktemp -d)
    PROGRESS_PIPE="$TMP_DIR/progress-pipe"
    mkfifo "$PROGRESS_PIPE"

    DATA_VECTOR=($(seq 0 1 99)) # A bunch of data

    printf '%s\0' "${DATA_VECTOR[@]}" | xargs -0 --max-args=1 --max-procs=5 -I {} bash -c "processData \$@ \"$PROGRESS_PIPE\"" _ {} &

    printStateInLoop "$PROGRESS_PIPE" ${#DATA_VECTOR[@]}
}

process
rm -Rf "$TMP_DIR"
In another post I got the suggestion to switch from while true; do … read line < "$pipe" … done to while read line; do … done < "$pipe" (function below), so that the pipe is not closed and reopened on every line read. This reduces the frequency of the problem, but it still happens: some lines are missing, and sometimes xargs reports bash: terminated by signal 13.
printStateInLoop() {
    local pipe="$1"
    local total="$2"
    local finished=0
    echo "Progress: $finished of $total"
    while [ $finished -lt $total ]; do
        while read line; do
            let finished++
            # In final script I would need to do more than just logging
            echo "Progress: $finished of $total"
        done <"$pipe"
    done
}
A lot of people on SO suggested using parallel or pv for this. Sadly, those tools aren't available on the very limited target platform, so my script has to be based on xargs.
Answer 1:
The solution (as pointed out by @markp-fuso and @Dale) was to use a file lock to serialize the writes to the pipe.
Instead of:
echo "$number" >"$pipe"
I now use flock to acquire (or wait for) the lock before writing:
flock "$pipe.lock" echo "$number" >"$pipe"
Source: https://stackoverflow.com/questions/64743111/writing-from-multiple-processes-launched-via-xargs-to-the-same-fifo-pipe-causes