Question
I'm trying to find a way to determine the job "slot" or "core" a command is currently using in parallel. For example, we've all seen a similar image of how parallel distributes commands, with one column per job slot.
If I want to know which column (slot) a certain process is in, how can I find out?
My specific problem, illustrated: if I set -j 4 to allow only 4 jobs to run at once, I want to know dynamically which slot a command is taking: 1, 2, 3 or 4. The problem is that I have some commands that cannot run at the same time as each other, but if I knew which slot I was running in, I'd be all set.
As a further example, say I have these commands I'm parallelizing:
command resource1 file1.rb
command resource2 file2.rb
command resource3 file3.rb
command resource4 file4.rb
command resource1 file5.rb
command resource2 file6.rb
command resource3 file7.rb
command resource4 file8.rb
Only one command can use each resource at a time. Say I put these commands in parallel as usual with 4 jobs at a time, and job 4 finishes, so parallel moves on to the next command in the queue. I now have these running in parallel:
command resource1 file1.rb
command resource2 file2.rb
command resource3 file3.rb
command resource1 file5.rb
Notice that resource1 is being used by two commands, which is not good. What I need is an environment variable or something similar to tell the next command to use resource number 4, so that the parallelized commands look like this:
command resource1 file1.rb
command resource2 file2.rb
command resource3 file3.rb
command resource4 file5.rb
I've thought about using the filesystem or some other external marker to record which resources are in use, but I figure that with parallel processes there are likely to be race conditions if I go that route.
I've looked all over; any help is greatly appreciated!
Answer 1:
I believe you are looking for {%}:
parallel -j4 command resource{%} file{}.rb ::: {1..8}
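To see which slot each job gets before running the real commands, a dry run along these lines may help. It is only a sketch: echo stands in for the actual command, and the function name show_slots is made up for illustration. {%} is the job slot number (1 to 4 with -j4), {#} is the job sequence number, and -k keeps the output in input order.

```shell
# Dry run: print what would be executed instead of executing it.
# "show_slots" is a hypothetical helper name, not part of GNU parallel.
show_slots() {
  # {%} = job slot number (1..4 here), {#} = job sequence number, {} = input argument
  parallel -j4 -k 'echo slot {%} job {#}: command resource{%} file{}.rb' ::: 1 2 3 4 5 6 7 8
}

# Only run the demo when GNU parallel is actually installed.
if command -v parallel >/dev/null 2>&1; then
  show_slots
fi
```

Since a slot runs only one job at a time, two jobs running at the same moment never share a slot number, so resource{%} should never be handed to two running commands at once. Which file lands in which slot depends on the order in which jobs finish, so the exact pairing can differ between runs.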
Source: https://stackoverflow.com/questions/28328597/gnu-parallel-how-do-determine-job-slot-youre-using