Question
The --headless option does not work for some links, although it works for others, and I don't know why. I need headless mode because the crawler has to run on an AWS instance, which has no GUI. Link: https://shop.nordstrom.com/s/pj-salvage-animal-lover-pajama-top-plus-size/5405170/full?origin=category-personalizedsort&breadcrumb=Home%2FWomen%2FClothing&color=charcoal
Using the headless option:
# Headless approach
from selenium import webdriver
from selenium.webdriver.chrome.options import Options

options = Options()
# A Python string needs no shell-style backslash escapes for spaces.
options.binary_location = "/Applications/Google Chrome.app/Contents/MacOS/Google Chrome"
options.add_argument("--start-maximized")
options.add_argument("--headless")
options.add_argument("--no-sandbox")
options.add_argument("--hide-scrollbars")
options.add_argument("--disable-infobars")
options.add_argument("--disable-dev-shm-usage")
options.add_argument("--window-size=1920,1080")
prefs = {
    "translate_whitelists": {"fr": "en", "de": "en", "it": "en", "no": "en", "es": "en", "sv": "en", "nl": "en",
                             "da": "en", "pl": "en", "fi": "en", "cs": "en"},
    "translate": {"enabled": "true"},
}
options.add_experimental_option("prefs", prefs)
# Create the driver once; Options() is also created only once so binary_location is kept.
driver = webdriver.Chrome(options=options)
Answer 1:
ChromeDriver 79.0.3945.16 and later fix the "Element is not clickable" issue in headless mode.
Download the latest version of chromedriver; hopefully that will solve your problem.
Here is the changelog for that release:
Fixed ChromeDriver crash caused by javascript alert fired during command execution
Fixed a bug causing Chromedriver to lock when an alert is fired while taking a screenshot
Removed --ignore-certificate-errors from Chrome launch command
Changed platform and platformName to windows on Win10
Fixed undefined window.navigator.webdriver when "enable-automation" is excluded
Fixed WPT test "test_not_editable_inputs[hidden]"
Fixed "Element is not clickable" when using headless mode
ChromeDriver changelog and download link: https://chromedriver.chromium.org/downloads
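To confirm that the installed driver is at least the release named above, a small check like the following may help. It is a minimal sketch, not part of the original answer: the helper names `parse_chromedriver_version` and `installed_chromedriver_version` are hypothetical, and it assumes that `chromedriver --version` prints a line containing a dotted version number.

```python
import re
import shutil
import subprocess

# Release whose changelog lists the headless "Element is not clickable" fix.
MIN_VERSION = (79, 0, 3945, 16)


def parse_chromedriver_version(text):
    """Return the first dotted version found in `text` as a tuple of ints, or None."""
    match = re.search(r"(\d+(?:\.\d+)+)", text)
    if match is None:
        return None
    return tuple(int(part) for part in match.group(1).split("."))


def installed_chromedriver_version(executable="chromedriver"):
    """Run `<executable> --version`; return None if the binary is not on PATH."""
    path = shutil.which(executable)
    if path is None:
        return None
    result = subprocess.run([path, "--version"], capture_output=True, text=True)
    return parse_chromedriver_version(result.stdout)
```

Calling `installed_chromedriver_version()` and comparing the result with `MIN_VERSION` (tuples compare element by element) shows whether an upgrade is needed.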
Edit: on AWS, follow these steps.
First, install Chrome:
curl -sS -o - https://dl-ssl.google.com/linux/linux_signing_key.pub | sudo apt-key add -
echo "deb [arch=amd64] http://dl.google.com/linux/chrome/deb/ stable main" | sudo tee -a /etc/apt/sources.list.d/google-chrome.list
sudo apt-get -y update
sudo apt-get -y install google-chrome-stable
Then download chromedriver:
wget https://chromedriver.storage.googleapis.com/79.0.3945.16/chromedriver_linux64.zip
unzip chromedriver_linux64.zip
Now move the binary and set its ownership and permissions:
sudo mv chromedriver /usr/bin/chromedriver
sudo chown root:root /usr/bin/chromedriver
sudo chmod +x /usr/bin/chromedriver
To launch Chrome, set options.binary_location to the path of the installed Chrome binary
and add the headless argument with options.add_argument('--headless').
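NB: Do not forget to install Java if you run the Selenium standalone server; the Python bindings driving chromedriver directly do not need it.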
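Putting the last step together, a launch script for the instance might look like the sketch below. It assumes that `google-chrome` is on PATH after the apt install (it is looked up with `shutil.which` rather than hard-coded), that chromedriver sits in /usr/bin as in the steps above, and that the `selenium` package is installed. `build_headless_args` and `make_driver` are hypothetical helper names.

```python
import shutil


def build_headless_args(width=1920, height=1080):
    """Chrome flags for a machine without a display."""
    return [
        "--headless",
        "--no-sandbox",             # Chrome refuses to start as root without this
        "--disable-dev-shm-usage",  # do not rely on a possibly small /dev/shm
        f"--window-size={width},{height}",
    ]


def make_driver(chrome_binary=None):
    """Create a headless Chrome driver; chromedriver is located via PATH."""
    # Imported here so build_headless_args() works without selenium installed.
    from selenium import webdriver
    from selenium.webdriver.chrome.options import Options

    options = Options()
    binary = chrome_binary or shutil.which("google-chrome")
    if binary:
        options.binary_location = binary
    for arg in build_headless_args():
        options.add_argument(arg)
    return webdriver.Chrome(options=options)
```

A caller would then run `driver = make_driver()`, load pages with `driver.get(...)`, and call `driver.quit()` when finished.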
Answer 2:
After a lot of searching, I noticed that websites which send JavaScript code first have trouble opening with the headless argument, so I came up with the solution of using a virtual display instead.
Run this command in a terminal (or Windows cmd or PowerShell) to install it:
pip install PyVirtualDisplay
Then put this at the start of your code:
from pyvirtualdisplay import Display
display = Display(visible=0, size=(800, 600))
display.start()
and at the end, after your own code:
display.stop()
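Because the display should be stopped even when the crawl raises an exception, the start/stop pair above can be wrapped in try/finally. The following is a sketch under assumptions: `run_with_virtual_display` is a hypothetical helper, `task` is any zero-argument callable that drives the browser, and PyVirtualDisplay needs an X server backend such as Xvfb installed on the machine. With a virtual display, Chrome is started without `--headless`.

```python
def run_with_virtual_display(task, make_display=None):
    """Start a virtual display, run task(), and always stop the display."""
    if make_display is None:
        # Imported lazily so the helper can be exercised with a stand-in display.
        from pyvirtualdisplay import Display

        def make_display():
            # Same settings as in the answer above.
            return Display(visible=0, size=(800, 600))

    display = make_display()
    display.start()
    try:
        return task()
    finally:
        display.stop()
```

A call such as `run_with_virtual_display(crawl)`, where `crawl` is your own function that creates the driver and visits the pages, keeps the cleanup in one place.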
Source: https://stackoverflow.com/questions/59047134/chrome-driver-headless-option-is-not-working-for-link