Nodejs batch processing

Submitted by 拈花ヽ惹草 on 2020-08-23 05:28:11

Question


A somewhat conceptual question.

I have, for example, 15 files that need to be processed, but I don't want to process them one at a time. Instead, I want to start processing 5 of them (any 5; the order is not important), and as soon as one of those 5 finishes, start another. The idea is to have at most 5 files being processed at the same time until all files are processed.

I'm trying to work this out in Node, but I'm missing the general idea of how this can be implemented.


Answer 1:


Here's a little example that simulates multiple workers reading from a central queue of work: https://jsfiddle.net/ctrlfrk/jsvyg69h/1/

// Fake "work" that is simply a task that takes as many milliseconds as its value.
const workQueue = [1000,4000,2000,4000,5000,3000,7000,1000,9000,9000,4000,2000,1000,3000,8000,2000,3000,7000,6000,30000];


const Worker = (name) => (channel) => {
  const history = [];
  const next = () => {
    const job = channel.getWork();
    if (!job) { // All done!
      console.log('Worker ' + name + ' completed');
      return;
    }
    history.push(job);
    console.log('Worker ' + name + ' grabbed new job:' + job +'. History is:', history);

    // Plain setTimeout works in both the browser and Node (Node has no `window`).
    setTimeout(next, job); // job is just the number of milliseconds to wait.
  };
  next();
}

const Channel = (queue) => {
  return { getWork: () => {
    return queue.pop();
  }};
};

let channel = Channel(workQueue);
let a = Worker('a')(channel);
let b = Worker('b')(channel);
let c = Worker('c')(channel);
let d = Worker('d')(channel);
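
The same "workers pulling from a shared queue" idea can be sketched with promises, which is closer to how file processing usually looks in Node. This is only a minimal sketch: `processAll`, `fakeProcess`, and the timings are hypothetical names and values invented for illustration, and the fake task would be replaced by real file handling.

```javascript
// Hypothetical helper: run `task(item)` for every item,
// with at most `limit` tasks in flight at any moment.
async function processAll(items, limit, task) {
  const results = new Array(items.length);
  let nextIndex = 0; // shared cursor; safe because callbacks run on a single thread

  // Each worker keeps claiming the next unclaimed item until none are left.
  async function worker() {
    while (nextIndex < items.length) {
      const i = nextIndex++;
      results[i] = await task(items[i]);
    }
  }

  const workerCount = Math.min(limit, items.length);
  await Promise.all(Array.from({ length: workerCount }, () => worker()));
  return results;
}

// Demo: 15 fake "files", each taking 10 to 50 ms, at most 5 at a time.
let active = 0;
let peak = 0;
const fakeProcess = (n) => new Promise((resolve) => {
  active++;
  peak = Math.max(peak, active);
  setTimeout(() => { active--; resolve(n * 2); }, 10 + (n % 5) * 10);
});

const files = Array.from({ length: 15 }, (_, i) => i + 1);
const done = processAll(files, 5, fakeProcess).then((results) => {
  console.log('peak concurrency:', peak);
  return results;
});
```

Results are stored by index, so they come back in input order even though the tasks finish in a different order.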



Answer 2:


A more accurate name for this type of processing might be 'limited parallel execution'. Mario Casciaro covers this well in his book, Node.js Design Patterns beginning on page 77. One use case for this pattern is when you want to control a set of parallel tasks that could cause excessive load. The example below is from his book.

Limited Parallel Execution Pattern

function TaskQueue(concurrency) {
  this.concurrency = concurrency;
  this.running = 0;
  this.queue = [];
}

TaskQueue.prototype.pushTask = function(task, callback) {
  this.queue.push(task);
  this.next();
}

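TaskQueue.prototype.next = function() {
  var self = this;
  // Start queued tasks until the concurrency limit is reached or the queue is empty.
  while(self.running < self.concurrency && self.queue.length) {
    var task = self.queue.shift();
    // Each task receives a completion callback; calling it frees a slot and
    // triggers another pass over the queue. Note that err is ignored here.
    task(function(err) {
      self.running--;
      self.next();
    });
    self.running++;
  }
}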
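
A usage sketch may help show how tasks are expected to look: each task is a function that takes a completion callback and calls it when finished. The `TaskQueue` code is repeated here, lightly adapted (the unused `callback` parameter is dropped and `running++` is moved before the task call), so the snippet runs on its own. `processFile`, the file names, and the 20 ms delay are hypothetical stand-ins for real work.

```javascript
function TaskQueue(concurrency) {
  this.concurrency = concurrency;
  this.running = 0;
  this.queue = [];
}
TaskQueue.prototype.pushTask = function (task) {
  this.queue.push(task);
  this.next();
};
TaskQueue.prototype.next = function () {
  var self = this;
  while (self.running < self.concurrency && self.queue.length) {
    var task = self.queue.shift();
    self.running++; // count the task before starting it
    task(function () {
      self.running--;
      self.next();
    });
  }
};

// Hypothetical stand-in for real file processing: finishes after a short delay.
function processFile(name, done) {
  setTimeout(function () { done(null, name); }, 20);
}

var queue = new TaskQueue(5);
var finished = [];
var peakRunning = 0;

// Resolves once all 15 tasks have reported completion.
var allDone = new Promise(function (resolve) {
  for (var i = 1; i <= 15; i++) {
    (function (name) {
      queue.pushTask(function (done) {
        peakRunning = Math.max(peakRunning, queue.running);
        processFile(name, function () {
          finished.push(name);
          done(); // frees the slot so the next queued task can start
          if (finished.length === 15) resolve(finished);
        });
      });
    })('file' + i + '.txt');
  }
});

allDone.then(function (names) {
  console.log('processed', names.length, 'files; peak running:', peakRunning);
});
```

The queue itself does not report when everything is finished, so this sketch counts completions and resolves a promise once all 15 are done.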



Answer 3:


You can do what you want with the code below, though I'm not sure why you want to do this.

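  function handle(file) {
    // Stop this slot once the queue is empty (files.shift() returned undefined).
    if (file === undefined) return;
    new Promise(function(resolve, reject) {
      doSomething(file, function(err) {
        if(err)
          reject(err);
        else
          resolve();
      });
    })
    .then(function() {
      // This file is done, so pick up the next one in the same slot.
      handle(files.shift());
    })
    .catch(function(err) {
      // Log the failure; this slot stops here instead of leaving an unhandled rejection.
      console.error(err);
    });
  }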

  var files = [1, 2, ....., 15];
  var max = 5;
  while(max--) {
    handle(files.shift());
  }


Source: https://stackoverflow.com/questions/39564675/nodejs-batch-processing
