When I scrape a page that contains products with the headless option enabled, I get different results.
For the same question, one time I get results that are not sorted, and an
Ideally, using or not using firefox_options.headless = True shouldn't have any major effect on the elements that get rendered within the DOM tree, but it can make a significant difference as far as the viewport is concerned.
As an example, in the run shown below, when GeckoDriver/Firefox is initialized with the --headless option the default viewport is width = 1366px, height = 768px, whereas when GeckoDriver/Firefox is initialized without the --headless option the default viewport is width = 1382px, height = 744px.
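Strictly speaking, get_window_size() reports the outer browser window, while the page layout depends on the inner viewport. As a minimal sketch, assuming `driver` is any Selenium WebDriver session you have already started, the hypothetical helper `window_and_viewport` below (not part of Selenium) reads both values so they can be compared between headless and non-headless runs:

```python
def window_and_viewport(driver):
    """Return the outer window size and the inner viewport size of a session.

    `driver` is assumed to be an already started Selenium WebDriver instance.
    This helper is illustrative only; it is not part of Selenium.
    """
    # Outer window size as reported by WebDriver.
    window = driver.get_window_size()
    # Inner viewport size as seen by the page's CSS and JavaScript.
    inner_width, inner_height = driver.execute_script(
        "return [window.innerWidth, window.innerHeight];"
    )
    return {
        "window": (window["width"], window["height"]),
        "viewport": (inner_width, inner_height),
    }

# Usage (requires a real browser session):
#   sizes = window_and_viewport(driver)
#   print(sizes["window"], sizes["viewport"])
```

If the two runs report different viewport values, any responsive layout on the page may render differently as well.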
Example Code:
from selenium import webdriver
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC

# Written against the Selenium 3 API (options.headless, executable_path);
# recent Selenium 4 releases removed both.

# First run: headless Firefox
options = webdriver.FirefoxOptions()
options.headless = True
driver = webdriver.Firefox(options=options, executable_path=r'C:\Utility\BrowserDrivers\geckodriver.exe')
driver.get("https://www.google.com/")
# Wait until the search box is clickable so the page has finished loading
WebDriverWait(driver, 10).until(EC.element_to_be_clickable((By.NAME, "q")))
print("Headless Firefox Initialized")
size = driver.get_window_size()
print("Window size: width = {}px, height = {}px".format(size["width"], size["height"]))
driver.quit()

# Second run: regular (non-headless) Firefox
driver = webdriver.Firefox(executable_path=r'C:\Utility\BrowserDrivers\geckodriver.exe')
driver.get("https://www.google.com/")
WebDriverWait(driver, 10).until(EC.element_to_be_clickable((By.NAME, "q")))
print("Firefox Initialized")
size = driver.get_window_size()
print("Window size: width = {}px, height = {}px".format(size["width"], size["height"]))
driver.quit()
Console Output:
Headless Firefox Initialized
Window size: width = 1366px, height = 768px
Firefox Initialized
Window size: width = 1382px, height = 744px
From the above observation it can be inferred that with the --headless option GeckoDriver/Firefox opens the browsing context with a narrower viewport (1366px versus 1382px wide in this run), and hence the number of elements identified can be lower.
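To check whether the window size really changes what gets identified on your page, you could count the matches for one selector at each size. This is only a sketch: `count_matches_per_size` is a hypothetical helper, `driver` is assumed to be an existing WebDriver session, and the URL and selector in the usage comment are placeholders. It also does not wait for lazily loaded content, which a real scraper would probably need.

```python
def count_matches_per_size(driver, url, css_selector, sizes):
    """Load `url` at each (width, height) in `sizes` and count selector matches.

    Returns a dict mapping (width, height) to the number of matching elements.
    Illustrative helper only; not part of Selenium.
    """
    counts = {}
    for width, height in sizes:
        # Resize first so the page is laid out at the requested size on load.
        driver.set_window_size(width, height)
        driver.get(url)
        # Count matches in the page itself via querySelectorAll.
        counts[(width, height)] = driver.execute_script(
            "return document.querySelectorAll(arguments[0]).length;",
            css_selector,
        )
    return counts

# Usage (placeholder URL and selector, requires a real browser session):
#   count_matches_per_size(driver, "https://example.com/products",
#                          ".product", [(1366, 768), (1382, 744)])
```

If the counts differ between sizes, the page is most likely rendering a different layout per viewport, which would explain the inconsistent results.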
When using GeckoDriver/Firefox to initiate a browsing context, always open it maximized or configure the size through set_window_size() as follows:
from selenium import webdriver

options = webdriver.FirefoxOptions()
options.headless = True
# Firefox takes the initial window size as separate --width/--height arguments
# ("window-size=..." and "start-maximized" are Chrome-specific arguments)
options.add_argument("--width=1400")
options.add_argument("--height=600")
driver = webdriver.Firefox(options=options, executable_path=r'C:\Utility\BrowserDrivers\geckodriver.exe')
driver.get("https://www.google.com/")
# Alternatively, resize after startup (or call driver.maximize_window())
driver.set_window_size(1920, 1080)
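Since a requested size is not always honoured (a visible window can be limited by the screen, for instance), it may help to verify the size after setting it. A minimal sketch, assuming `driver` is an existing WebDriver session; `open_with_fixed_size` is a hypothetical helper name, not a Selenium function:

```python
def open_with_fixed_size(driver, url, width=1920, height=1080):
    """Resize the window, load `url`, and report whether the size was applied.

    Returns a tuple (actual_size, matched), where actual_size is the
    (width, height) reported by the driver after resizing.
    Illustrative helper only; not part of Selenium.
    """
    driver.set_window_size(width, height)
    driver.get(url)
    actual = driver.get_window_size()
    actual_size = (actual["width"], actual["height"])
    # matched is False when the browser or OS adjusted the requested size.
    return actual_size, actual_size == (width, height)

# Usage (requires a real browser session):
#   size, ok = open_with_fixed_size(driver, "https://www.google.com/")
#   if not ok:
#       print("Window size differs from the requested one:", size)
```

Running the same check in both headless and non-headless mode should make it obvious when the two runs are not comparing like with like.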
You can find a couple of relevant discussions on window size in: