mechanize-ruby

How do I scrape data through Mechanize and Nokogiri?

不羁的心 submitted on 2019-12-11 09:13:53
Question: I am working on an application that fetches the HTML from http://www.screener.in/. I can enter a company name such as "Atul Auto Ltd", submit it, and then scrape the following details from the next page: "CMP/BV" and "CMP". I am using this code:

    require 'mechanize'
    require 'rubygems'
    require 'nokogiri'
    Company_name='Atul Auto Ltd.'
    agent = Mechanize.new
    page = agent.get('http://www.screener.in/')
    form = agent.page.forms[0]
    print agent.page.forms[0].fields
    agent.page.forms[0]["q"]=Company_name

Getting error “getaddrinfo: No such host is known. (SocketError)” with mechanize gem

我只是一个虾纸丫 submitted on 2019-12-10 23:35:22
Question: I tried the code below:

    require 'mechanize'
    agent = Mechanize.new{|a| a.ssl_version, a.verify_mode = 'SSLv3', OpenSSL::SSL::VERIFY_NONE}
    page = agent.get "https://gegsltraining.aravo.com/"
    page=page.link_with(:dom_class => "button").click()

But I get the error below:

    D:\WIPData\Ruby\Scripts>mechanize_dowload.rb
    C:/Ruby193/lib/ruby/gems/1.9.1/gems/net-http-persistent-2.8/lib/net/http/persistent/ssl_reuse.rb:29:in `initialize': getaddrinfo: No such host is known. (SocketError) from C
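
The message `getaddrinfo: No such host is known` is raised when the host name cannot be resolved, which happens before any HTTP or SSL exchange. A minimal sketch for checking name resolution separately from Mechanize, using only Ruby's standard library. The helper name `resolvable?` is hypothetical, and this is a diagnostic aid rather than a diagnosis of the failure in the question:

```ruby
require 'socket'

# Hypothetical helper: returns true if the host name resolves to at least
# one address, false if the lookup fails with SocketError.
def resolvable?(host)
  !Socket.getaddrinfo(host, nil).empty?
rescue SocketError
  false
end

# The .invalid top-level domain is reserved, so this lookup should fail.
puts resolvable?('no-such-host.invalid')
```

If the helper returns false for the target host, the problem lies in DNS, proxy, or network configuration rather than in the Mechanize options.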

Mechanize getting “Errno::ECONNRESET: Connection reset by peer - SSL_connect”

余生长醉 submitted on 2019-12-10 18:34:07
Question: I'm unable to get Mechanize to load a page that used to work; it reliably fails with an Errno::ECONNRESET: Connection reset by peer - SSL_connect message. Any suggestions as to what I should try or which details I should look at? (Please see "what I've tried" below...)

Update 1: Taking a hint from a related S.O. post, I tried accessing the site directly with Net::HTTP. When I set http.ssl_version = :TLSv1, I get a redirect rather than an error (as it should be). So my question becomes: how
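
The Net::HTTP check described in Update 1 can be sketched as follows. The host `example.com` and the helper name `build_tls_client` are placeholders that do not come from the question, and whether a given Ruby/OpenSSL build still accepts `:TLSv1` is an assumption:

```ruby
require 'net/http'
require 'openssl'

# Hypothetical helper: builds a Net::HTTP client pinned to one SSL/TLS
# version. No connection is opened until a request is actually sent.
def build_tls_client(host, ssl_version)
  http = Net::HTTP.new(host, 443)
  http.use_ssl = true
  http.ssl_version = ssl_version
  http
end

http = build_tls_client('example.com', :TLSv1)

# Sending a request needs network access, so it is left as a comment:
#   response = http.head('/')
#   puts response.code
```

Comparing the result of one pinned version against another helps narrow down whether the reset is tied to protocol negotiation.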

How can I perform a Head request using mechanize in Ruby

别来无恙 submitted on 2019-12-10 15:28:09
Question: I can perform a HEAD request with Faraday (Faraday.head url), but I am using Mechanize on my current project. I would like to grab a value from the header (filename) without downloading the file. Does the Mechanize gem provide such an option? I am using v2.0.

Answer 1: Just like get, but with head instead:

    page = agent.head 'http://www.google.com/'
    page.body.length #=> 0
    page.header.keys #=> ["date", "expires", "cache-control", "content-type", "set-cookie", "p3p", "server", "x-xss-protection", "x
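
Once the HEAD response is available, the header value still has to be parsed to get the file name. A minimal sketch, assuming the name is carried in a `Content-Disposition` header (the question does not say which header holds it). `filename_from_disposition` is a hypothetical helper, and the Mechanize lines stay in comments because they need network access:

```ruby
# Hypothetical helper: extracts the filename parameter from a
# Content-Disposition header value, or returns nil if there is none.
def filename_from_disposition(value)
  return nil if value.nil?

  match = value.match(/filename="?([^";]+)"?/i)
  match && match[1].strip
end

# With Mechanize, the header value would come from a HEAD response:
#   page = agent.head(url)
#   name = filename_from_disposition(page.header['content-disposition'])

puts filename_from_disposition('attachment; filename="report.pdf"')
```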

Scraping pages that do not seem to have URLs

走远了吗. submitted on 2019-12-09 03:47:27
Question: I'm trying to scrape these listings to give more exposure to job listings on a site that belongs to a client of mine. The issue is that I need to be able to link to the specific job listing so that the job seeker can apply. This is the page I'm trying to save listing links from. It would be ideal if I could save an address for the job seeker to click on to see the original listing and then apply. What is this website doing to not feature a URL for these pages? Is it possible to

How do I convert from a Mechanize::File object to a Mechanize::Page object?

淺唱寂寞╮ submitted on 2019-12-07 19:26:55
Question: I have a page that logs into a form. After logging in there are a few redirects. The first one looks like this:

    #<Mechanize::File:0x1f4ff23 @filename="MYL.html", @code="200", @response={"cache-control"=>"no-cache=\"set-cookie\"", "content-length"=>"114", "set-cookie"=>"JSESSIONID=GdJnPVnhtN91KZfQPc3QzM1NLCyWDsnyvpGg8LL0Knnz3RgqxLFs!1803804592!-2134626567; path=/; secure, COOKIE_TEST=Aslyn; secure", "x-powered-by"=>"Servlet/2.4 JSP/2.0"}, @body="\r\n<html>\r\n <head>\r\n <meta http-equiv=\

How do I convert from a Mechanize::File object to a Mechanize::Page object?

一世执手 submitted on 2019-12-06 13:29:01
I have a page that logs into a form. After logging in there are a few redirects. The first one looks like this:

    #<Mechanize::File:0x1f4ff23 @filename="MYL.html", @code="200", @response={"cache-control"=>"no-cache=\"set-cookie\"", "content-length"=>"114", "set-cookie"=>"JSESSIONID=GdJnPVnhtN91KZfQPc3QzM1NLCyWDsnyvpGg8LL0Knnz3RgqxLFs!1803804592!-2134626567; path=/; secure, COOKIE_TEST=Aslyn; secure", "x-powered-by"=>"Servlet/2.4 JSP/2.0"}, @body="\r\n<html>\r\n <head>\r\n <meta http-equiv=\"refresh\" content=\"0;URL=MYL?Select=OK&StateName=38\">\r\n </head>\r\n</html>", @uri=#<URI::HTTPS
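
The body of the Mechanize::File shown above is a meta-refresh page. One way to continue, sketched here with only the standard library, is to pull the target out of the `content` attribute and resolve it against the URI of the current response. The base URI below is a placeholder because the real one is cut off in the dump, and `meta_refresh_target` is a hypothetical helper:

```ruby
require 'uri'

# Hypothetical helper: finds the URL in a <meta http-equiv="refresh"> tag
# and resolves it against base_uri. Returns nil if no such tag is found.
def meta_refresh_target(body, base_uri)
  pattern = /<meta[^>]+http-equiv=["']?refresh["']?[^>]*content=["']\s*\d+\s*;\s*URL=([^"']+)["']/i
  match = body.match(pattern)
  return nil unless match

  URI.join(base_uri, match[1]).to_s
end

body = "\r\n<html>\r\n <head>\r\n <meta http-equiv=\"refresh\" " \
       "content=\"0;URL=MYL?Select=OK&StateName=38\">\r\n </head>\r\n</html>"

# Placeholder base URI; the actual @uri is truncated in the question.
puts meta_refresh_target(body, 'https://example.com/app/MYL')
```

With Mechanize, the resolved URL could then be passed to `agent.get` to request the next page in the redirect chain.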

Iconv::IllegalSequence when using www::mechanize

北城以北 submitted on 2019-12-06 09:32:21
Question: I'm trying to do a little bit of web scraping, but the WWW::Mechanize gem doesn't seem to like the encoding and crashes. The POST request results in a 302 redirect (which Mechanize follows, so far so good), and the resulting page seems to crash it. I googled quite a bit, but so far nothing has come up on how to solve this. Does anyone have an idea? Code:

    require 'rubygems'
    require 'mechanize'
    agent = WWW::Mechanize.new
    agent.user_agent_alias = 'Mac Safari'
    answer = agent.post('https://www.budget.de/de

Ruby Mechanize: Follow a Link

我的未来我决定 submitted on 2019-12-05 11:34:22
In Mechanize on Ruby, I have to assign a new variable to every new page I come to. For example:

    page2 = page1.link_with(:text => "Continue").click
    page3 = page2.link_with(:text => "About").click
    ...etc

Is there a way to run Mechanize without a variable holding every page state? Something like:

    my_only_page.link_with(:text => "Continue").click!
    my_only_page.link_with(:text => "About").click!

Niels Kristian
I don't know if I understand your question correctly, but if it's a matter of looping through a lot of pages dynamically and processing them, you could do it like this:

    require 'mechanize'
    url = "http:/
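
One way to avoid naming every intermediate page is to fold a list of link texts over a single starting page. The sketch below uses stand-in `FakePage` and `FakeLink` classes so that it runs without network access; both are hypothetical and only mimic the `link_with(:text => ...)` and `click` calls from the question. With Mechanize, the same `reduce` would start from the page returned by `agent.get`:

```ruby
# Stand-ins for Mechanize pages and links (hypothetical, illustration only).
FakeLink = Struct.new(:target) do
  def click
    target
  end
end

class FakePage
  attr_reader :title

  def initialize(title, links = {})
    @title = title
    @links = links # maps link text => destination page
  end

  def link_with(criteria)
    FakeLink.new(@links.fetch(criteria[:text]))
  end
end

# Follows each link text in turn, carrying only the current page along.
def follow_links(start_page, link_texts)
  link_texts.reduce(start_page) do |page, text|
    page.link_with(:text => text).click
  end
end

about = FakePage.new('About')
second = FakePage.new('Second', 'About' => about)
first = FakePage.new('First', 'Continue' => second)

puts follow_links(first, ['Continue', 'About']).title
```

Only the final page is returned, so no `page2`, `page3` variables are needed; an empty list of link texts simply returns the starting page.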

Iconv::IllegalSequence when using www::mechanize

ε祈祈猫儿з submitted on 2019-12-04 15:58:59
I'm trying to do a little bit of web scraping, but the WWW::Mechanize gem doesn't seem to like the encoding and crashes. The POST request results in a 302 redirect (which Mechanize follows, so far so good), and the resulting page seems to crash it. I googled quite a bit, but so far nothing has come up on how to solve this. Does anyone have an idea? Code:

    require 'rubygems'
    require 'mechanize'
    agent = WWW::Mechanize.new
    agent.user_agent_alias = 'Mac Safari'
    answer = agent.post('https://www.budget.de/de/reservierung/privatkunden/step1/schnellbuchung', {"Country" => "Deutschland", "Abholstation" => "Aalen",
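
Iconv has since been removed from Ruby's standard library. On current Ruby versions, byte sequences that are illegal in the expected encoding can be handled with `String#encode` and `String#scrub`. A minimal sketch, assuming the page is either really ISO-8859-1 or contains stray invalid bytes (the excerpt does not establish which); it illustrates the general technique and is not a fix for the original WWW::Mechanize setup:

```ruby
# Bytes as they might arrive from a server that sends ISO-8859-1 text.
raw = "M\xFCnchen".b

# Case 1: the real encoding is known, so transcode it to UTF-8.
latin1 = raw.dup.force_encoding('ISO-8859-1')
utf8 = latin1.encode('UTF-8')

# Case 2: the encoding is unknown; treat the bytes as UTF-8 and
# replace anything invalid with a placeholder character.
scrubbed = raw.dup.force_encoding('UTF-8').scrub('?')

puts utf8
puts scrubbed
```

Transcoding preserves the original character when the source encoding is known, while scrubbing trades the bad bytes for a placeholder so that later string operations do not raise.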