How to get the PID of a process in a pipeline

北海茫月 2020-12-03 01:51

Consider the following simplified example:


my_prog | awk '...' > output.csv &
my_pid="$!"  # Gives the PID of awk instead of my_prog
sleep 10
kill $my_pid


        
9 Answers
  • 2020-12-03 02:22

    Based on your comment, I still can't see why you'd prefer killing my_prog to letting it complete in an orderly fashion. Ten seconds is a fairly arbitrary limit on a multiprocessing system, where my_prog could generate 10k lines or 0 lines of output depending on system load.

    If you want to limit the output of my_prog to something more determinate try

    my_prog | head -n 1000 | awk '...'
    

    without detaching from the shell. In the worst case, head will close its input and my_prog will get a SIGPIPE. In the best case, change my_prog so it gives you the amount of output you want.
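
    A minimal sketch of that bounded pipeline, using `yes` as a stand-in for my_prog (an assumption, since the real program is not shown). Once head has passed on 1000 lines it exits, and the endless producer stops on its next write without any explicit kill:

```shell
#!/usr/bin/env bash
# `yes` stands in for my_prog (an assumption): it prints the same line forever.
# head exits after 1000 lines, which closes the pipe the producer writes to,
# so the whole pipeline ends by itself.
count=$(yes 'sample line' | head -n 1000 | awk 'END { print NR }')
echo "lines seen by awk: $count"
```

    The command substitution only returns because every stage of the pipeline terminated, which is the point of the approach.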

    added in response to comment:

    Insofar as you have control over my_prog, give it an optional -s duration argument. Then somewhere in your main loop you can put the predicate:

    if (duration_exceeded()) {
        exit(0);
    }
    

    where exit will in turn flush the output FILE streams properly. If you are desperate and there is no place to put the predicate, this could be implemented using alarm(3), which I am intentionally not showing because it is bad.
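
    The same idea can be sketched in shell. Here my_prog is a hypothetical shell function, and the -s option is an assumed interface rather than anything the real program is known to support; the loop emits a record, checks the elapsed time, and leaves cleanly once the limit is reached:

```shell
#!/usr/bin/env bash
# Hypothetical stand-in for my_prog; the "-s SECONDS" option is an assumed interface.
my_prog() (
    duration=0                      # 0 means "run forever" in this sketch
    if [ "$1" = "-s" ]; then
        duration=$2
    fi
    start=$(date +%s)
    n=0
    while :; do
        n=$((n + 1))
        echo "record $n"
        # The predicate from the main loop: leave cleanly once time is up.
        if [ "$duration" -gt 0 ] && [ $(( $(date +%s) - start )) -ge "$duration" ]; then
            exit 0
        fi
        sleep 0.1
    done
)

# The consumer simply sees end-of-file when the producer finishes.
lines=$(my_prog -s 1 | awk 'END { print NR }')
echo "awk received $lines lines"
```

    How many lines arrive still depends on system load, but the producer ends on its own terms and nothing has to be killed.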

    The core of your trouble is that my_prog runs forever. Everything else here is a hack to get around that limitation.

  • 2020-12-03 02:24

    Improving on @Marvin's and @Nils Goroll's answers with a one-liner that extracts the PIDs of all commands in the pipe into a shell array variable:

    # run some command
    ls -l | rev | sort > /dev/null &
    
    # collect pids (the pattern matches a 5-character PID field, so it assumes PIDs of at most 5 digits)
    pids=($(jobs -l % | grep -E -o '^(\[[0-9]+\]\+|    ) [ 0-9]{5} ' | sed -e 's/^[^ ]* \+//' -e 's! $!!'))
    
    # use them for something
    echo pid of ls -l: ${pids[0]}
    echo pid of rev: ${pids[1]}
    echo pid of sort: ${pids[2]}
    echo pid of first command e.g. ls -l: $pids
    echo pid of last command e.g. sort: ${pids[-1]}
    
    # wait for last command in pipe to finish
    wait ${pids[-1]}
    

    In my solution ${pids[-1]} contains the value normally available in $!. Please note the use of jobs -l % which outputs just the "current" job, which by default is the last one started.

    Sample output:

    pid of ls -l: 2725
    pid of rev: 2726
    pid of sort: 2727
    pid of first command e.g. ls -l: 2725
    pid of last command e.g. sort: 2727
    

    UPDATE 2017-11-13: Improved the pids=... command so that it works better with complex (multi-line) commands.

  • 2020-12-03 02:25

    Just had the same issue. My solution:

    process_1 | process_2 &
    PID_OF_PROCESS_2=$!
    PID_OF_PROCESS_1=`jobs -p`
    

    This only works if the pipeline is the only background job, because jobs -p prints one PID per job (that of the job's process group leader, here process_1). Otherwise, you need to parse the full output of jobs -l.
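
    A minimal sketch of this approach applied to the original problem, assuming bash and using `sleep 30` and `cat` as stand-ins for process_1 and process_2 (both are assumptions):

```shell
#!/usr/bin/env bash
# `sleep 30` stands in for process_1 and `cat` for process_2 (assumptions).
sleep 30 | cat &
last_pid=$!              # PID of cat, the last command in the pipeline
first_pid=$(jobs -p)     # PID of sleep; valid only while this is the sole job

echo "first: $first_pid  last: $last_pid"

# Killing the producer closes the pipe, so cat sees end-of-file and exits.
kill "$first_pid"
wait                     # reap both processes
echo "pipeline finished"
```

    Only the first process is signalled; the consumer is left to finish whatever input it has already received.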

  • 2020-12-03 02:32

    Add a shell wrapper around your command and capture the pid. For my example I use iostat.

    #!/bin/sh
    echo $$ > /tmp/my.pid
    exec iostat 1
    

    exec replaces the shell with the new process, preserving the PID.

    ./test.sh | grep avg
    

    While that runs:

    $ cat /tmp/my.pid
    22754
    $ ps -ef | grep iostat
    userid  22754  4058  0 12:33 pts/12   00:00:00 iostat 1
    

    So you can:

    sleep 10
    kill `cat /tmp/my.pid`
    

    Is that more elegant?
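
    The wrapper can also be made generic. The sketch below is one possible variant, not part of the answer above: with_pidfile.sh is a hypothetical helper name, `sleep 30` stands in for the long-running producer, and the polling loop is an assumed way to avoid reading the PID file before the wrapper has written it:

```shell
#!/usr/bin/env bash
workdir=$(mktemp -d)

# Hypothetical generic wrapper: record its own PID, then become the real command.
cat > "$workdir/with_pidfile.sh" <<'EOF'
# usage: sh with_pidfile.sh PIDFILE COMMAND [ARGS...]
pidfile=$1
shift
echo $$ > "$pidfile"
exec "$@"
EOF

# `sleep 30` stands in for the long-running producer (an assumption).
sh "$workdir/with_pidfile.sh" "$workdir/producer.pid" sleep 30 | cat &

# Wait (up to about 5 seconds) until the wrapper has written the PID file.
tries=0
until [ -s "$workdir/producer.pid" ] || [ "$tries" -ge 50 ]; do
    sleep 0.1
    tries=$((tries + 1))
done
producer_pid=$(cat "$workdir/producer.pid")

kill "$producer_pid"     # signals sleep, which took over the wrapper's PID
wait
rm -rf "$workdir"
echo "killed producer $producer_pid"
```

    Passing the PID file as an argument avoids the fixed /tmp path, so two pipelines can run side by side without overwriting each other's file.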

  • 2020-12-03 02:34

    Inspired by @Demosthenex's answer, using a subshell:

    $ ( echo $BASHPID > pid1; exec vmstat 1 5 ) | tail -1 & 
    [1] 17371
    $ cat pid1
    17370
    $ pgrep -fl vmstat
    17370 vmstat 1 5
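
    A small sketch of why $BASHPID is used here rather than $$, assuming bash and with `sleep 30` standing in for the producer (an assumption). Inside the subshell, $$ still expands to the PID of the invoking shell, while $BASHPID is the PID of the subshell itself, which exec then hands over to the command:

```shell
#!/usr/bin/env bash
pidfile=$(mktemp)

# Record both values from inside the subshell, then replace it with the command.
( echo "$$ $BASHPID" > "$pidfile"; exec sleep 30 ) | cat &

# Wait until the subshell has written the file before reading it.
until [ -s "$pidfile" ]; do sleep 0.1; done
read -r parent_pid producer_pid < "$pidfile"
rm -f "$pidfile"

echo "\$\$ inside the subshell:       $parent_pid (this script is $$)"
echo "\$BASHPID inside the subshell: $producer_pid"

kill "$producer_pid"     # signals sleep, which inherited the subshell's PID
wait
```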
    
  • 2020-12-03 02:36

    I was desperately looking for a good solution to get all the PIDs from a pipe job, and one promising approach failed miserably (see previous revisions of this answer).

    So, unfortunately, the best I could come up with is parsing the jobs -l output using GNU awk:

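    function last_job_pids {
        # Requires a non-empty argument (its value is not used below);
        # without one, the function prints nothing.
        if [[ -z "${1}" ]] ; then
            return
        fi
    
        jobs -l | awk '
            # A line starting with "[" begins a new job: forget earlier PIDs, keep field 2.
            /^\[/ { delete pids; pids[$2]=$2; seen=1; next; }
            # Any other line continues the current job: its PID is field 1.
            // { if (seen) { pids[$1]=$1; } }
            # Print the PIDs of the last job listed (awk does not guarantee the order).
            END { for (p in pids) print p; }'
    }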
    