Creating a headless Chrome instance in Python

陌清茗 2020-12-23 18:09

This question describes my conclusion after researching available options for creating a headless Chrome instance in Python and asks for confirmation or resources that descr

5 answers
  • 2020-12-23 18:31

    While I'm the author of CasperJS, I invite you to check out Ghost.py, a WebKit web client written in Python.

    While it's heavily inspired by CasperJS, it's not based on PhantomJS. It still uses PyQt bindings and WebKit, though.

  • 2020-12-23 18:34

    I use this to get the driver:

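    from selenium import webdriver
    from selenium.webdriver.chrome.options import Options
    from selenium.webdriver.chrome.service import Service
    
    
    def get_browser(storage_dir, headless=False):
        """
        Get the browser (a "driver").
    
        Parameters
        ----------
        storage_dir : str
            Directory where downloaded files are saved.
        headless : bool
            If True, run Chrome without a visible window.
    
        Returns
        -------
        browser : selenium webdriver object
        """
        # Find the path with 'which chromedriver'.
        path_to_chromedriver = '/usr/local/bin/chromedriver'
    
        chrome_options = Options()
        if headless:
            chrome_options.add_argument("--headless")
        chrome_options.add_experimental_option('prefs', {
            "plugins.plugins_list": [{"enabled": False,
                                      "name": "Chrome PDF Viewer"}],
            "download": {
                "prompt_for_download": False,
                "default_directory": storage_dir,
                "directory_upgrade": False,
                "open_pdf_in_system_reader": False
            }
        })
    
        # Selenium 4 style: the driver path is wrapped in a Service and the
        # options are passed as `options=`. Older releases took the path
        # positionally and used the `chrome_options=` keyword instead.
        browser = webdriver.Chrome(
            service=Service(executable_path=path_to_chromedriver),
            options=chrome_options)
        return browser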
    

    Toggle the headless parameter to choose whether the browser window is visible while it runs.

  • Any reason you haven't considered Selenium with the Chrome Driver?

    http://code.google.com/p/selenium/wiki/ChromeDriver

    http://code.google.com/p/selenium/wiki/PythonBindings

  • 2020-12-23 18:45

    CasperJS drives a headless WebKit browser, but as far as I know it has no Python bindings. It is command-line oriented, though that doesn't mean you couldn't run it from Python in a way that satisfies what you are after. When you run casperjs, you pass it the path to the JavaScript file you want to execute, so you would need to generate that file from Python.

    But all that aside, I bring up casperjs because it seems to satisfy the lightweight, headless requirement very nicely.
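
    A minimal sketch of that idea, assuming a `casperjs` executable is on your PATH. The helper names (`write_casper_script`, `build_command`, `fetch_title`) and the file name `fetch_title.js` are made up for illustration. Python writes the JavaScript to a temporary file, runs casperjs on it, and reads the result from stdout:

```python
import json
import subprocess
import tempfile
from pathlib import Path

# JavaScript template for CasperJS: open a page and print its title.
# The %s placeholder is filled with a JSON-encoded (quoted, escaped) URL.
CASPER_SCRIPT = """
var casper = require('casper').create();
casper.start(%s, function () {
    this.echo(this.getTitle());
});
casper.run();
"""


def write_casper_script(url, directory):
    """Write the CasperJS script for `url` into `directory`; return its path."""
    script_path = Path(directory) / "fetch_title.js"  # hypothetical file name
    script_path.write_text(CASPER_SCRIPT % json.dumps(url), encoding="utf-8")
    return script_path


def build_command(script_path, executable="casperjs"):
    """Build the command line: casperjs takes the script path as its argument."""
    return [executable, str(script_path)]


def fetch_title(url):
    """Run casperjs on a generated script and return what it printed."""
    with tempfile.TemporaryDirectory() as tmp:
        script_path = write_casper_script(url, tmp)
        result = subprocess.run(build_command(script_path),
                                capture_output=True, text=True, check=True)
    return result.stdout.strip()
```

    Calling `fetch_title("http://example.com/")` would start casperjs as a child process. It raises `FileNotFoundError` if the executable is missing, and `subprocess.CalledProcessError` if casperjs exits with an error.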

  • 2020-12-23 18:56

    This question is 5 years old now, and at the time it was a big challenge to run headless Chrome from Python. The good news is:

    Starting from version 59, released in June 2017, Chrome ships with a headless mode. This means we can use it in a non-graphical server environment and run tests without pages being visually rendered, which saves a lot of time and memory when testing or scraping. Setting up Selenium for that is very easy:

    (I assume that you have installed Selenium and ChromeDriver):

    from selenium import webdriver
    
    # Set up a headless browser.
    options = webdriver.ChromeOptions()
    options.add_argument('--headless')
    # Recent Selenium releases take `options=`; older ones used `chrome_options=`.
    browser = webdriver.Chrome(options=options)
    

    Chrome will now run headlessly. If you remove the options argument from the last line, the browser window will be shown instead.
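
    As a rough sketch of putting this to use, the snippet below loads a page headlessly and always shuts the browser down afterwards. The function names and the 1280x800 window size are my own assumptions, not part of the setup above, and it uses the Selenium 4 style `options=` keyword:

```python
def chrome_arguments(headless=True, window_size=(1280, 800)):
    """Return the Chrome command-line flags for the requested mode.

    The window size is an assumed default; adjust it to your needs.
    """
    args = ["--window-size={},{}".format(*window_size)]
    if headless:
        args.append("--headless")
    return args


def make_browser(headless=True):
    """Create a Chrome driver (needs Selenium and ChromeDriver installed)."""
    # Imported here so chrome_arguments() stays usable without Selenium.
    from selenium import webdriver

    options = webdriver.ChromeOptions()
    for argument in chrome_arguments(headless):
        options.add_argument(argument)
    return webdriver.Chrome(options=options)


def page_title(url, headless=True):
    """Load `url` and return the page title, closing the browser afterwards."""
    browser = make_browser(headless)
    try:
        browser.get(url)
        return browser.title
    finally:
        # quit() ends the ChromeDriver process even if get() raised.
        browser.quit()
```

    The `try`/`finally` matters more in headless mode than usual, because a leaked browser process has no window to remind you that it is still running.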
