Is it safe to run multiple instances of Puppeteer at the same time?

小蘑菇 2021-02-01 20:50

Is it safe/supported to run multiple instances of Puppeteer at the same time, either at

  1. the process level (multiple node screenshot.js at the same ti
3 answers
  •  挽巷 (original poster)
    2021-02-01 21:10

    It's fine to run multiple browsers, contexts, or even pages in parallel. The practical limits depend on your network, disk, memory, and how your tasks are set up.
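
    Since those limits are machine-specific, one common approach is to cap how many jobs run at once. The sketch below is only an illustration: `runWithLimit` is a hypothetical helper, not part of Puppeteer, and it assumes each job is an async function (for example, one that opens a page with `browser.newPage()`, does its work, and closes the page).

```javascript
// Hypothetical helper (not part of Puppeteer): run async jobs with at most
// `limit` of them in flight at any time. Results keep the order of `jobs`.
async function runWithLimit(jobs, limit) {
  const results = new Array(jobs.length);
  let next = 0; // index of the next job nobody has claimed yet

  // Each worker repeatedly claims the next unclaimed job until none remain.
  async function worker() {
    while (next < jobs.length) {
      const index = next++;
      results[index] = await jobs[index]();
    }
  }

  // Start at least one worker, and never more workers than jobs.
  const workerCount = Math.max(1, Math.min(limit, jobs.length));
  await Promise.all(Array.from({ length: workerCount }, () => worker()));
  return results;
}
```

    If any job rejects, the returned promise rejects with that error, so jobs that may fail should handle their own errors or be wrapped in a retry.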

    I crawled a few million pages, and from time to time (in my setup, roughly every 10,000 pages) Puppeteer would crash. You should therefore have a way to auto-restart the browser and retry the job.
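
    A minimal sketch of that restart-and-retry idea follows. It is an assumption about how one might do it, not the approach used in the crawl described above: `runWithRestart` is a hypothetical helper that launches a fresh browser for every attempt, and `launch` stands in for any function that returns a browser-like object with a `close()` method, such as `() => puppeteer.launch()`.

```javascript
// Hypothetical helper: run `job(browser)` in a freshly launched browser.
// If launching or the job throws (for example because the browser crashed),
// close that browser and try again with a new one, up to `maxAttempts` times.
async function runWithRestart(launch, job, maxAttempts = 3) {
  let lastError;
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    let browser;
    try {
      browser = await launch();
      return await job(browser);
    } catch (err) {
      lastError = err; // remember the failure, then loop to retry
    } finally {
      if (browser) {
        // Best-effort cleanup; closing a crashed browser may itself fail.
        try {
          await browser.close();
        } catch (_) {}
      }
    }
  }
  throw lastError; // every attempt failed
}
```

    With Puppeteer, the job would typically open a page via `browser.newPage()` and do its work there. Launching a browser per job is simple but slow, so a long crawl would more likely reuse one browser and restart it only after a failure.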

    You might want to check out puppeteer-cluster, which takes care of pooling the browser instances, detecting crashes, and restarting. (Disclaimer: I'm the author.)

    Here is an example of creating a cluster:

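    const { Cluster } = require('puppeteer-cluster');

    (async () => {
        // Create a cluster that runs up to 10 browsers in parallel
        const cluster = await Cluster.launch({
            concurrency: Cluster.CONCURRENCY_BROWSER,
            maxConcurrency: 10,
        });

        // Queue your jobs (one example)
        cluster.queue(async ({ page }) => {
            await page.goto('http://www.wikipedia.org');
            await page.screenshot({ path: 'wikipedia.png' });
        });

        // Wait for all queued jobs to finish, then shut down the browsers
        await cluster.idle();
        await cluster.close();
    })();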
    

    This is just a minimal example. There are many more ways to use the cluster.
