Selenium Webdriver vs Mechanize

Submitted by 余生长醉 on 2019-11-29 02:34:55

Question


I am interested in automating repetitive data entry in some forms on a website I frequent. So far, the tools I've found that would support this in a headless fashion are Selenium WebDriver and Mechanize.

My question is: is there a fundamental technical difference in using one versus the other? Selenium is mostly used for testing, but I've also noticed some folks use it for exactly what I'm looking for, which is automating data entry. Testing becomes a secondary benefit in that case.

Are there reasons to choose Mechanize over Selenium for what I want to do? Or does it not matter, and both of these tools will work?

I'm not asking which is better, I'm asking which is the right tool for the job. Perhaps I'm not understanding the premise behind the purpose of each tool.


Answer 1:


These are completely different tools whose use cases overlap somewhat in web scraping, web automation, and automated data extraction.

mechanize is a mature and widely used tool for programmatic web browsing with a lot of built-in features, like cookie handling, browser history, and form submission. The key thing to understand here is that mechanize.Browser is not a real browser: it cannot execute or understand JavaScript, and it cannot send the asynchronous requests that are often needed to build a web page.
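
To make the form-submission workflow concrete, here is a minimal sketch of data entry with mechanize. The URL and the field names (`title`, `hours`) are hypothetical placeholders, and the helper `submit_entry` is not part of mechanize; it only wraps the library's `open`, `select_form`, item assignment, and `submit` calls. It assumes the form is present in the raw HTML the server returns, since nothing here runs JavaScript.

```python
def submit_entry(browser, url, fields, form_index=0):
    """Open url, fill the form at form_index with fields, and submit it.

    `browser` is expected to behave like mechanize.Browser.
    """
    browser.open(url)
    # Select a form by its position on the page (0 = first form).
    browser.select_form(nr=form_index)
    for name, value in fields.items():
        # Item assignment sets the value of the named form control.
        browser[name] = value
    return browser.submit()


def run_example():
    """Hypothetical usage; the URL and field names are placeholders."""
    import mechanize  # third-party: pip install mechanize

    br = mechanize.Browser()
    response = submit_entry(
        br,
        "https://example.com/entry",  # placeholder URL
        {"title": "Weekly report", "hours": "8"},  # placeholder fields
    )
    print(response.geturl())
```

Calling `run_example()` would perform one submission. If the target form only appears after scripts run, `select_form` will likely fail to find it, which is the limitation described above.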

This is where selenium comes into play - it is a browser automation tool that is also widely used in web scraping. selenium usually becomes a "fall-back" tool: when someone cannot scrape a site with mechanize, or with alternatives such as RoboBrowser or MechanicalSoup, because of, for instance, its JavaScript "heaviness", the choice is usually selenium. With selenium you can also go headless, either by automating the PhantomJS browser or by using a virtual display. The most commonly mentioned drawback is performance - with selenium you are working with the target site as a real user in a web browser, which loads the additional files needed to build the page, makes XHR requests, renders the result, and so on.
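
For comparison, a rough sketch of the same kind of data entry driven through a real browser with selenium. It assumes headless Chrome rather than PhantomJS (my substitution, not something the answer prescribes), and the URL, field names, and CSS selector are hypothetical. The helper names `make_headless_driver` and `fill_and_submit` are illustrative only.

```python
# Locator strategy strings; these are the same values as selenium's
# By.NAME and By.CSS_SELECTOR constants.
BY_NAME = "name"
BY_CSS = "css selector"


def make_headless_driver():
    """Start Chrome without a visible window (requires selenium and Chrome)."""
    from selenium import webdriver  # third-party: pip install selenium

    options = webdriver.ChromeOptions()
    options.add_argument("--headless")
    return webdriver.Chrome(options=options)


def fill_and_submit(driver, url, fields, submit_css):
    """Load url in the browser, type into each named field, click submit."""
    driver.get(url)  # the browser loads the page and runs its JavaScript
    for name, value in fields.items():
        box = driver.find_element(BY_NAME, name)
        box.clear()
        box.send_keys(value)
    driver.find_element(BY_CSS, submit_css).click()
    return driver.current_url


def run_example():
    """Hypothetical usage; URL, fields, and selector are placeholders."""
    driver = make_headless_driver()
    try:
        print(fill_and_submit(
            driver,
            "https://example.com/entry",
            {"title": "Weekly report", "hours": "8"},
            "button[type=submit]",
        ))
    finally:
        driver.quit()  # always release the browser process
```

On pages that build the form asynchronously, the `find_element` calls may need to be preceded by an explicit wait (selenium's `WebDriverWait`), which adds to the performance cost mentioned above.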

This by itself does not mean you should use selenium everywhere - choose the tool wisely, because it fits the problem better, not because you are more familiar with it.


Also note that you should first consider using an API (if the target website provides one) instead of resorting to web scraping. And, if it comes to that, be a good web-scraping citizen:

  • How to be a good citizen when crawling web sites?
  • Web scraping etiquette


Source: https://stackoverflow.com/questions/31530335/selenium-webdriver-vs-mechanize
