Question
I have a script using a headless browser, which I run from cron (set up with crontab -e). It runs fine the first few times and then crashes with the following traceback:
Traceback (most recent call last):
  File "/home/clint-selenium-firefox.py", line 83, in <module>
    driver.get(url)
  File "/usr/local/lib/python2.7/dist-packages/selenium/webdriver/remote/webdriver.py", line 248, in get
    self.execute(Command.GET, {'url': url})
  File "/usr/local/lib/python2.7/dist-packages/selenium/webdriver/remote/webdriver.py", line 236, in execute
    self.error_handler.check_response(response)
  File "/usr/local/lib/python2.7/dist-packages/selenium/webdriver/remote/errorhandler.py", line 192, in check_response
    raise exception_class(message, screen, stacktrace)
selenium.common.exceptions.WebDriverException: Message: Failed to decode response from marionette
My crontab line is:
*/10 * * * * export DISPLAY=:0 && python /home/clint-selenium-firefox.py >> /home/error.log 2>&1
I don't want to overload this with the python script so I've pulled out what I think are the relevant bits.
from pyvirtualdisplay import Display
display = Display(visible=0, size=(800, 600))
display.start()
...
driver = webdriver.Firefox()
driver.get(url)
...
driver.quit()
...
display.stop()
Your help is much appreciated.
EDIT
Versions: Firefox 49.0.2; Selenium 3.0.1; geckodriver: geckodriver-v0.11.1-linux64.tar.gz
Code around the error (failing on driver.get(url)):
driver = webdriver.Firefox()
if DEBUG: print "Opened Firefox"
for u in urls:
    list_of_rows = []
    list_of_old_rows = []
    # get the old version of the site data
    mycsvfile = u[1]
    try:
        with open(mycsvfile, 'r') as csvfile:
            old_data = csv.reader(csvfile, delimiter=' ', quotechar='|')
            for o in old_data:
                list_of_old_rows.append(o)
    except: pass
    # get the new data
    url = u[0]
    if DEBUG: print url
    driver.get(url)
    if DEBUG: print driver.title
    time.sleep(1)
    page_source = driver.page_source
    soup = bs4.BeautifulSoup(page_source,'html.parser')
Answer 1:
From the geckodriver issue "Multiple Firefox instances failing with NS_ERROR_SOCKET_ADDRESS_IN_USE" (#99):
"This is because no --marionette-port option is passed to geckodriver - which means all instances of geckodriver launch firefox passing the same desired default port (2828). The first firefox instance binds to that port, future instances can't and all the geckodriver instances end up connecting to the first firefox instance - which produces all sorts of unpredictable behavior."
Followed by:
"I think a reasonable short-term solution is to do what the other drivers are doing and ask Marionette to bind to a randomised, free port generated by geckodriver. Currently it uses 2828 as the default for all instances it spawns of Firefox. Since Marionette unfortunately does not yet have an out-of-band way of communicating the port back to the client (geckodriver), this is inherently racy but we can improve the situation in the future with one of the proposals from bug 1240830."
This change was made in Selenium 3.0.0.b2:
* Updated Marionette port argument to match other drivers.
I guess a randomised port only avoids collisions for so long, so I suggest raising an issue. A code fix may be required for the versions of Selenium, Firefox and geckodriver that you have. Until this is fixed, you could drop back to Selenium 2.53.0 and Firefox ESR 38.8. Your call.
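To illustrate the port collision and the "randomised, free port" idea from the quoted issue, here is a minimal sketch using only the standard socket module. The helper names find_free_port and port_is_in_use are hypothetical and are not part of Selenium or geckodriver. The sketch does not show how a port would be passed to geckodriver. As the quote notes, the approach is inherently racy: another process can take the port between the check and its actual use.

```python
import socket


def find_free_port(host="127.0.0.1"):
    """Ask the OS for a currently unused TCP port (hypothetical helper).

    Binding to port 0 lets the kernel pick a free port. The socket is
    closed before returning, so another process could grab the port
    before the caller uses it.
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        sock.bind((host, 0))
        return sock.getsockname()[1]
    finally:
        sock.close()


def port_is_in_use(port, host="127.0.0.1"):
    """Return True if binding to the port fails (hypothetical helper)."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        sock.bind((host, port))
        return False
    except socket.error:
        # Typically "address already in use": something else holds the port.
        return True
    finally:
        sock.close()


if __name__ == "__main__":
    # 2828 is the default Marionette port mentioned in the quoted issue.
    print("2828 in use: %s" % port_is_in_use(2828))
    print("free port suggested by the OS: %d" % find_free_port())
```

Running this from the cron job just before starting Firefox would at least record in the log whether something (for example a Firefox left over from a previous run) is still holding port 2828.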
UPDATE: Try pointing Selenium at the Firefox binary explicitly:
from selenium import webdriver
from selenium.webdriver.firefox.firefox_binary import FirefoxBinary
binary = FirefoxBinary('path/to/binary')
driver = webdriver.Firefox(firefox_binary=binary)
Source: https://stackoverflow.com/questions/40612091/headless-script-crashes-after-a-few-runs