Headless script crashes after a few runs

Submitted by 自闭症网瘾萝莉.ら on 2019-12-23 07:42:50

Question


I have a script that drives a headless browser, scheduled through cron (set up with crontab -e). It runs fine the first few times and then crashes with the following traceback:

Traceback (most recent call last):
  File "/home/clint-selenium-firefox.py", line 83, in <module>
    driver.get(url)
  File "/usr/local/lib/python2.7/dist-packages/selenium/webdriver/remote/webdriver.py", line 248, in get
    self.execute(Command.GET, {'url': url})
  File "/usr/local/lib/python2.7/dist-packages/selenium/webdriver/remote/webdriver.py", line 236, in execute
    self.error_handler.check_response(response)
  File "/usr/local/lib/python2.7/dist-packages/selenium/webdriver/remote/errorhandler.py", line 192, in check_response
    raise exception_class(message, screen, stacktrace)
selenium.common.exceptions.WebDriverException: Message: Failed to decode response from marionette

My crontab line is:

*/10 * * * * export DISPLAY=:0 && python /home/clint-selenium-firefox.py >> /home/error.log 2>&1

I don't want to overload this post with the whole Python script, so I've pulled out what I think are the relevant parts.

from pyvirtualdisplay import Display
from selenium import webdriver

display = Display(visible=0, size=(800, 600))
display.start()
...
driver = webdriver.Firefox()
driver.get(url)
...
driver.quit()
...
display.stop()

Your help is much appreciated.

EDIT

Versions: Firefox 49.0.2; Selenium 3.0.1; geckodriver 0.11.1 (geckodriver-v0.11.1-linux64.tar.gz)

Code around error (failing on driver.get(url)):

driver = webdriver.Firefox()
if DEBUG: print "Opened Firefox"

for u in urls:
    list_of_rows = []
    list_of_old_rows = []

    # get the old version of the site data
    mycsvfile = u[1]
    try:
        with open(mycsvfile, 'r') as csvfile:
            old_data = csv.reader(csvfile, delimiter=' ', quotechar='|')
            for o in old_data:
                list_of_old_rows.append(o)
    except: pass

    # get the new data
    url = u[0]
    if DEBUG: print url    

    driver.get(url)
    if DEBUG: print driver.title
    time.sleep(1)
    page_source = driver.page_source
    soup = bs4.BeautifulSoup(page_source, 'html.parser')

Answer 1:


From the issue "Multiple Firefox instances failing with NS_ERROR_SOCKET_ADDRESS_IN_USE" (#99):

"This is because no --marionette-port option is passed to geckodriver - which means all instances of geckodriver launch firefox passing the same desired default port (2828). The first firefox instance binds to that port, future instances can't and all the geckodriver instances end up connecting to the first firefox instance - which produces all sorts of unpredictable behavior."

Followed by:

"I think a reasonable short-term solution is to do what the other drivers are doing and ask Marionette to bind to a randomised, free port generated by geckodriver. Currently it uses 2828 as the default for all instances it spawns of Firefox. Since Marionette unfortunately does not yet have an out-of-band way of communicating the port back to the client (geckodriver), this is inherently racy but we can improve the situation in the future with one of the proposals from bug 1240830."
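The "randomised, free port" idea in the quote can be illustrated in a few lines of Python. This is only a sketch of the general technique (bind to port 0 and let the OS choose), not geckodriver's actual implementation, and the helper names pick_free_port and is_listening are hypothetical. It also shows where the race comes from: the port is released again before the real consumer binds it, so another process could take it in between.

```python
import socket


def pick_free_port(host="127.0.0.1"):
    """Ask the OS for a TCP port that is free at this moment."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        sock.bind((host, 0))  # port 0 means "kernel, pick one for me"
        return sock.getsockname()[1]
    finally:
        # The port is released here, which is what makes the approach racy.
        sock.close()


def is_listening(port, host="127.0.0.1", timeout=1.0):
    """Return True if something accepts TCP connections on host:port."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.settimeout(timeout)
    try:
        return sock.connect_ex((host, port)) == 0
    finally:
        sock.close()
```

Calling is_listening(2828) at the start of a cron run would give a quick hint about whether a Firefox left over from an earlier run is still holding Marionette's default port.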

This change was made in

Selenium 3.0.0.b2
* Updated Marionette port argument to match other drivers.

I guess a random port only works for so long. Consider raising an issue; a code fix may be required for the versions of Selenium, Firefox and geckodriver that you have. Until this is fixed, you could drop back to Selenium 2.53.0 and Firefox ESR 38.8. Your call.
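Because the quoted issue is about several Firefox instances competing for one port, it may be worth ruling out two cron runs overlapping, for example when one run takes longer than the 10-minute interval. The sketch below is not part of the fix described above; it is a hedged illustration that assumes a Unix system (it uses an advisory fcntl.flock lock), and the helper name acquire_run_lock is hypothetical.

```python
import fcntl


def acquire_run_lock(lock_path):
    """Try to take an exclusive, non-blocking lock on lock_path.

    Returns the open file object holding the lock, or None if another
    process (or another open handle) already holds it. Keep the returned
    object alive for the whole run; closing it releases the lock.
    """
    handle = open(lock_path, "w")
    try:
        fcntl.flock(handle, fcntl.LOCK_EX | fcntl.LOCK_NB)
    except (IOError, OSError):
        # Lock is held elsewhere: a previous run is probably still active.
        handle.close()
        return None
    return handle
```

A script could call acquire_run_lock with a path of its choosing before starting the display and exit early when it returns None, so that a slow run is never joined by a second Firefox.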

UPDATE: Try passing the Firefox binary explicitly:

from selenium import webdriver
from selenium.webdriver.firefox.firefox_binary import FirefoxBinary

binary = FirefoxBinary('path/to/binary')
driver = webdriver.Firefox(firefox_binary=binary)
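Separately from the binary path, note that the script in the question only reaches driver.quit() and display.stop() when nothing fails. If driver.get(url) raises, the Firefox and geckodriver processes can be left running, and a later cron run may then collide with them on the default port. A minimal sketch of guaranteed cleanup with try/finally follows; the function name run_with_cleanup and its three callable parameters are hypothetical, chosen so the pattern does not depend on any particular driver.

```python
def run_with_cleanup(make_display, make_driver, visit):
    """Start a display and a driver, call visit(driver), always shut down.

    make_display and make_driver are zero-argument callables; the display
    object needs start()/stop() and the driver object needs quit().
    """
    display = make_display()
    display.start()
    try:
        driver = make_driver()
        try:
            return visit(driver)
        finally:
            driver.quit()  # runs even if visit() raised
    finally:
        display.stop()  # runs even if the driver failed to start


# Possible use with the objects from the question (not executed here):
#   run_with_cleanup(lambda: Display(visible=0, size=(800, 600)),
#                    webdriver.Firefox,
#                    lambda driver: driver.get(url))
```

The exception still propagates, so the traceback continues to land in the cron log; only the leftover browser process is avoided.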


Source: https://stackoverflow.com/questions/40612091/headless-script-crashes-after-a-few-runs
