Python parallel execution with selenium

夕颜
夕颜 2020-11-30 10:25

I'm confused about parallel execution in Python using Selenium. There seem to be a few ways to go about it, but some seem out of date.

I'm wondering what is the l

2 Answers
  • 2020-11-30 11:11

    I created a project to do this and it reuses webdriver instances for better performance:

    https://github.com/testlabauto/local_selenium_pool

    https://pypi.org/project/local-selenium-pool/
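
    The linked project's API is not shown in this answer, so what follows is only a generic sketch of the idea of reusing driver instances, not that library's interface. `DriverPool`, `fetch_title`, and `FakeDriver` are hypothetical names; `FakeDriver` stands in for a real browser so the sketch runs anywhere. With real Selenium, the factory passed to the pool would be something like `webdriver.Chrome`.

```python
import queue
from concurrent.futures import ThreadPoolExecutor
from contextlib import contextmanager


class DriverPool:
    """Hypothetical pool: creates `size` drivers once and lends them out."""

    def __init__(self, make_driver, size):
        self._size = size
        self._drivers = queue.Queue()
        for _ in range(size):
            self._drivers.put(make_driver())

    @contextmanager
    def borrow(self):
        driver = self._drivers.get()  # blocks until a driver is free
        try:
            yield driver
        finally:
            self._drivers.put(driver)  # hand it back for the next task

    def shutdown(self):
        # Assumes every borrowed driver has already been returned.
        for _ in range(self._size):
            self._drivers.get().quit()


class FakeDriver:
    """Stand-in for a WebDriver; mimics only get(), title, and quit()."""
    created = 0

    def __init__(self):
        FakeDriver.created += 1
        self.title = None

    def get(self, url):
        self.title = "title of " + url

    def quit(self):
        pass


def fetch_title(pool, url):
    # Borrow a driver, use it, and return it to the pool automatically.
    with pool.borrow() as driver:
        driver.get(url)
        return driver.title


urls = ["https://example.com/%d" % i for i in range(10)]
pool = DriverPool(FakeDriver, size=3)
with ThreadPoolExecutor(max_workers=3) as executor:
    titles = list(executor.map(lambda u: fetch_title(pool, u), urls))
pool.shutdown()
```

    Ten URLs are processed here with only three driver instances, which is where the saving comes from: starting a browser is usually far more expensive than loading a page in one that is already running.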

  • 2020-11-30 11:26

    Use joblib's Parallel class to do that; joblib is a great library for parallel execution.

    Let's say we have a list of URLs named urls and we want to take a screenshot of each one in parallel.

    First, let's import the necessary libraries:

    from selenium import webdriver
    from joblib import Parallel, delayed
    

    Now let's define a function that takes a screenshot and returns it as a base64 string:

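    def take_screenshot(url):
        # Each call starts its own browser process.
        # Note: webdriver.PhantomJS was removed in Selenium 4, so this needs Selenium 3.x or older.
        phantom = webdriver.PhantomJS('/path/to/phantomjs')
        try:
            phantom.get(url)
            # The screenshot is returned as a base64-encoded PNG string
            return phantom.get_screenshot_as_base64()
        finally:
            # quit() ends the driver process; close() only closes the current window
            phantom.quit()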
    

    Now, to execute that in parallel:

    screenshots = Parallel(n_jobs=-1)(delayed(take_screenshot)(url) for url in urls)
    

    When this line finishes executing, screenshots holds the results from all of the worker processes, in the same order as urls.
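
    Since each entry in screenshots is a base64-encoded PNG string, it has to be decoded before it can be written to disk. A minimal sketch, assuming a hypothetical helper `save_screenshots` and a made-up naming scheme (`screenshot_0.png`, `screenshot_1.png`, ...); the `sample` list is placeholder data standing in for real results:

```python
import base64
import os
import tempfile


def save_screenshots(screenshots, out_dir):
    """Decode base64 strings and write them as numbered PNG files (hypothetical helper)."""
    os.makedirs(out_dir, exist_ok=True)
    paths = []
    for index, encoded in enumerate(screenshots):
        path = os.path.join(out_dir, "screenshot_%d.png" % index)
        with open(path, "wb") as handle:
            handle.write(base64.b64decode(encoded))
        paths.append(path)
    return paths


# Placeholder data standing in for the list returned by Parallel(...).
sample = [
    base64.b64encode(b"\x89PNG fake image %d" % i).decode("ascii")
    for i in range(3)
]
out_dir = tempfile.mkdtemp()
saved_paths = save_screenshots(sample, out_dir)
```

    Because the results come back in input order, the file index can be matched to the position of the URL in urls.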

    Explanation about Parallel

    • Parallel(n_jobs=-1) means use all available CPU cores
    • delayed(function)(input) is joblib's way of capturing a function and its arguments without calling it, so that Parallel can run the call later in a worker
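
    To make the delayed bullet concrete, here is a simplified model of the idea in plain Python. `delayed_sketch`, `run_tasks`, and `fake_screenshot` are hypothetical stand-ins, not joblib's implementation: the first records a call as a (function, args, kwargs) tuple, the second executes the recorded calls and returns results in input order. It uses threads for brevity, whereas joblib uses worker processes by default.

```python
from concurrent.futures import ThreadPoolExecutor


def delayed_sketch(function):
    """Return a wrapper that records a call instead of making it."""
    def record(*args, **kwargs):
        return function, args, kwargs
    return record


def run_tasks(tasks, workers=4):
    """Run recorded calls in a thread pool; results keep the input order."""
    with ThreadPoolExecutor(max_workers=workers) as executor:
        futures = [
            executor.submit(function, *args, **kwargs)
            for function, args, kwargs in tasks
        ]
        return [future.result() for future in futures]


def fake_screenshot(url):
    # Placeholder for take_screenshot so the sketch runs without a browser.
    return "screenshot of " + url


urls = ["https://example.com/a", "https://example.com/b"]

# Recording a call does not execute it; it only captures what to run.
task = delayed_sketch(fake_screenshot)(urls[0])

# Same shape as Parallel(...)(delayed(f)(url) for url in urls).
results = run_tasks(delayed_sketch(fake_screenshot)(url) for url in urls)
```

    The generator expression only builds descriptions of the calls; the pool decides when and where each one actually runs.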

    More information can be found in the joblib docs.
