How to get the PID of a process in a pipeline

北海茫月 2020-12-03 01:51

Consider the following simplified example:

my_prog | awk '...' > output.csv &
my_pid="$!" # Gives the PID of awk instead of my_prog
sleep 10
kill $my_pid

  • 2020-12-03 02:39

    Here is a solution without wrappers or temporary files. It only works for a background pipeline whose output is redirected away from the stdout of the containing script, as in your case. Suppose you want to do:

    cmd1 | cmd2 | cmd3 >pipe_out &
    # do something with PID of cmd2

    If only bash could provide ${PIPEPID[n]}!! The replacement "hack" that I found is the following:

    PID=$( { cmd1 | { cmd2 0<&4 & echo $! >&3 ; } 4<&0 | cmd3 >pipe_out & } 3>&1 | head -1 )

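    The detour through fd 4 is needed because, when job control is not active (as in a script), bash gives a background command /dev/null as its stdin; 0<&4 hands the pipe back to cmd2. If needed, you can also close fd 3 (for every cmd*) and fd 4 (for cmd2) with 3>&- and 4<&-, respectively. If you do, close fd 4 for cmd2 only after redirecting fd 0 from it.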
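
    As a rough illustration of the one-liner with the optional fd closes added, here is a minimal sketch. The commands are stand-ins chosen for this example only: `sleep 2` plays cmd1, the first `cat` plays cmd2, the second `cat` plays cmd3, and `out_file` is a hypothetical temporary file in place of pipe_out.

    ```bash
    #!/usr/bin/env bash
    # Stand-ins (assumptions, not from the answer): sleep = cmd1, cat = cmd2, cat = cmd3.
    out_file=$(mktemp)

    cmd2_pid=$(
      {
        # cmd1: fd 3 is closed because only the echo below needs it.
        sleep 2 3>&- |
          # cmd2: restore stdin from fd 4 first, then close fd 4 and fd 3.
          { cat 0<&4 4<&- 3>&- & echo "$!" >&3; } 4<&0 |
          # cmd3: its output goes to a file, not to the script's stdout.
          cat 3>&- >"$out_file" &
      } 3>&1 | head -n 1
    )

    echo "PID of the cmd2 stand-in: $cmd2_pid"

    # kill -0 only checks that the process exists; it sends no signal.
    if kill -0 "$cmd2_pid" 2>/dev/null; then
      echo "cmd2 stand-in is running; sending SIGTERM"
      kill "$cmd2_pid"
    fi

    rm -f "$out_file"
    ```

    Note that the pipeline processes are children of subshells, not of the script itself, so the script cannot `wait` for them here.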

  • 2020-12-03 02:41

    I was able to solve it by explicitly naming the pipe with mkfifo.

    Step 1: mkfifo capture.

    Step 2: Run this script

    my_prog > capture &
    my_pid="$!" #Now, I have the PID for my_prog!
    awk '...' capture > out.csv & 
    sleep 10
    kill $my_pid #kill my_prog
    wait #wait for awk to finish.

    I don't like having to manage the FIFO myself. Hopefully someone has an easier solution.
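
    One way to reduce the FIFO housekeeping is to create it in a private temporary directory and remove it with a trap. The following is only a sketch under assumptions: `producer` is a hypothetical stand-in for my_prog, `tr` stands in for the awk command, and `work_dir` is a name invented for this example.

    ```bash
    #!/usr/bin/env bash
    # Hypothetical stand-in for my_prog: prints a line every second until killed.
    producer() { while :; do echo tick; sleep 1; done; }

    work_dir=$(mktemp -d)               # private directory that holds the FIFO
    trap 'rm -rf "$work_dir"' EXIT      # FIFO and output are removed on exit
    mkfifo "$work_dir/capture"

    producer >"$work_dir/capture" &
    my_pid=$!                           # PID of the producer, not of the consumer

    # Stand-in for awk '...': read from the FIFO, write to a result file.
    tr a-z A-Z <"$work_dir/capture" >"$work_dir/out.txt" &

    sleep 3
    kill "$my_pid"                      # stop the producer only
    wait                                # the consumer sees EOF and finishes
    cat "$work_dir/out.txt"
    ```

    Because the consumer is never killed, it reads everything the producer wrote before it reaches EOF. Bash may report the terminated producer on stderr when `wait` reaps it.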

  • 2020-12-03 02:43

    My solution was to query jobs and parse its output with perl.
    Start two pipelines in the background:

    $ sleep 600 | sleep 600 | sleep 600 | sleep 600 | sleep 600 &
    $ sleep 600 | sleep 600 | sleep 600 | sleep 600 | sleep 600 &

    Query background jobs:

    $ jobs
    [1]-  Running                 sleep 600 | sleep 600 | sleep 600 | sleep 600 | sleep 600 &
    [2]+  Running                 sleep 600 | sleep 600 | sleep 600 | sleep 600 | sleep 600 &
    $ jobs -l
    [1]-  6108 Running                 sleep 600
          6109                       | sleep 600
          6110                       | sleep 600
          6111                       | sleep 600
          6112                       | sleep 600 &
    [2]+  6114 Running                 sleep 600
          6115                       | sleep 600
          6116                       | sleep 600
          6117                       | sleep 600
          6118                       | sleep 600 &

    Parse the listing of the second job, %2. The parsing is probably error-prone, but it works in these cases. The aim is to capture the first number on each line that is followed by a space. The result is stored in the variable pids as an array, using parentheses:

    $ pids=($(jobs -l %2 | perl -pe '/(\d+) /; $_=$1 . "\n"'))
    $ echo $pids
    6114
    $ echo ${pids[*]}
    6114 6115 6116 6117 6118
    $ echo ${pids[2]}
    6116
    $ echo ${pids[4]}
    6118

    And for the first pipeline:

    $ pids=($(jobs -l %1 | perl -pe '/(\d+) /; $_=$1 . "\n"'))
    $ echo ${pids[2]}
    6110
    $ echo ${pids[4]}
    6112

    We could wrap this into a little alias/function:

    function pipeid() { jobs -l ${1:-%%} | perl -pe '/(\d+) /; $_=$1 . "\n"'; }
    $ pids=($(pipeid))     # PIDs of last job
    $ pids=($(pipeid %1))  # PIDs of first job

    I have tested this in bash and zsh. Unfortunately, in bash I could not pipe the output of pipeid into another command, probably because that pipeline runs in a subshell that cannot query the job list.
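
    If perl is not available, a similar extraction can be sketched in plain bash. This is an illustration, not a drop-in replacement: `extract_pids` is a hypothetical helper name, and the sample text is a shortened copy of the `jobs -l` listing for job %2 shown above.

    ```bash
    #!/usr/bin/env bash
    # extract_pids (hypothetical helper): read `jobs -l`-style text on stdin
    # and print, per line, the first run of digits followed by whitespace.
    extract_pids() {
      local line
      while IFS= read -r line; do
        if [[ $line =~ ([0-9]+)[[:space:]] ]]; then
          printf '%s\n' "${BASH_REMATCH[1]}"
        fi
      done
    }

    # Shortened copy of the listing for job %2 from the answer.
    sample='[2]+  6114 Running                 sleep 600
          6115                       | sleep 600
          6116                       | sleep 600 &'

    # Collect the PIDs into an array in the current shell.
    pids=()
    while IFS= read -r pid; do
      pids+=("$pid")
    done < <(extract_pids <<<"$sample")

    echo "${pids[*]}"   # prints: 6114 6115 6116
    ```

    With live jobs, one option is to redirect the builtin to a file first, for example `jobs -l %% >"$tmp"` followed by `extract_pids <"$tmp"` (where `$tmp` is a hypothetical temporary file). A builtin with only a redirection runs in the current shell, so it still sees the job table.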
