I need to limit the number of processes being executed in parallel. For instance, I'd like to execute this pseudo-command line:
```shell
export POOL_PARALLELISM=4
for i
```
There's definitely no need to write this tool yourself; there are several good choices.
### make

`make` can do this pretty easily, but it does rely extensively on files to drive the process. (If you want to run some operation on every input file that produces an output file, this might be awesome.) The `-j` command line option will run the specified number of tasks at once, and the `-l` load-average option will hold off starting new tasks until the system load average drops below the given limit. (Which might be nice if you want to do some work "in the background". Don't forget about the `nice(1)` command, which can also help here.)
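Combining the two, a background batch might be started like this sketch (the `-j`/`-l` numbers and the `all` target are arbitrary examples, assuming GNU `make`):

```shell
# Up to 8 parallel jobs, but only start new ones while the load
# average is below 4, and run everything at minimal CPU priority.
nice -n 19 make -j8 -l 4 all
```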
So, a quick (and untested) `Makefile` for image converting:

```make
ALL=$(patsubst cimg%.jpg,thumb_cimg%.jpg,$(wildcard *.jpg))

.PHONY: all
all: $(ALL)

thumb_cimg%.jpg: cimg%.jpg
	convert $< -resize 100x100 $@
```
If you run this with `make`, it'll run one at a time. If you run it with `make -j8`, it'll run eight separate jobs. If you run `make -j`, it'll start hundreds. (When compiling source code, I find that twice the number of cores is an excellent starting point. That gives each processor something to do while waiting on disk IO requests. Different machines and different loads may work differently.)
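That starting point is easy to compute at invocation time; a minimal sketch, assuming GNU coreutils' `nproc` is available:

```shell
# Run twice as many jobs as there are CPU cores.
make -j"$(( $(nproc) * 2 ))"
```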
### xargs

`xargs` provides the `--max-procs` command line option. This is best if the parallel processes can be divided apart based on a single input stream, with input commands separated either by ASCII `NUL` bytes or by newlines. (Well, the `-d` option lets you pick something else, but these two are common and easy.) This gives you the benefit of using `find(1)`'s powerful file-selection syntax rather than writing funny expressions like the `Makefile` example above, or lets your input be completely unrelated to files. (Consider a program for factoring large composite numbers into prime factors -- making that task fit into `make` would be awkward at best; `xargs` could do it easily.)
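For instance, the factoring case might look like this sketch, feeding coreutils' `factor` a newline-separated list (the numbers are arbitrary examples; `-d` is a GNU `xargs` option):

```shell
# Four factoring processes at a time; -n 1 hands each process one number.
printf '%s\n' 600851475143 99400891 4294967297 |
    xargs -d '\n' -n 1 --max-procs 4 factor
```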
The earlier example might look something like this:

```shell
find . -name '*.jpg' -print0 | xargs -0 --max-procs 16 -I {} convert {} -resize 100x100 thumb_{}
```
### parallel

The `moreutils` package (available at least on Ubuntu) provides the `parallel` command. It can run in two different ways: either running a specified command on different arguments, or running different commands in parallel. The previous example could look like this:

```shell
parallel -i -j 16 convert {} -resize 100x100 thumb_{} -- *.jpg
```
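Its other mode, running unrelated commands side by side, might look like this sketch (assuming the moreutils `parallel`; GNU `parallel` uses a different syntax):

```shell
# Run three independent commands, at most two at a time.
parallel -j 2 -- "du -sh /tmp" uptime date
```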
### beanstalkd

The `beanstalkd` program takes a completely different approach: it provides a message bus that you submit requests to, while job servers block on jobs being entered, execute the jobs, and then return to waiting for a new job on the queue. If you want to write data back to the specific HTTP request that initiated the job, this might not be very convenient, as you have to provide that mechanism yourself (perhaps a different "tube" on the `beanstalkd` server). But if the end result is submitting data into a database, or email, or something similarly asynchronous, this might be the easiest to integrate into your existing application.
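The wire protocol is plain text, so a minimal sketch of submitting a job fits in one line of shell, assuming a `beanstalkd` server on the default port 11300 and a netcat that supports `-q` (flags vary between netcat variants; the job body "resize 1234" is a made-up example):

```shell
# put <priority> <delay> <ttr> <bytes>, then the job body; the server
# replies "INSERTED <id>" on success.
printf 'put 0 0 60 11\r\nresize 1234\r\n' | nc -q 1 localhost 11300
```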