mechanize-ruby

I can't remove whitespace from a string parsed by Nokogiri

若如初见. submitted on 2019-12-03 07:22:39
I can't remove whitespace from a string. My HTML is:

```html
<p class='your-price'> Cena pro Vás: <strong>139 <small>Kč</small></strong> </p>
```

My code is:

```ruby
#encoding: utf-8
require 'rubygems'
require 'mechanize'

agent = Mechanize.new
site  = agent.get("http://www.astratex.cz/podlozky-pod-raminka/doplnky")
price = site.search("//p[@class='your-price']/strong/text()")
val   = price.first.text #=> "139 "
val.strip                #=> "139 "
val.gsub(" ", "")        #=> "139 "
```

gsub, strip, etc. don't work. Why, and how do I fix this?

```ruby
val.class    #=> String
val.dump     #=> "\"139\\u{a0}\""
val.encoding #=> #<Encoding:UTF-8>
__ENCODING__ =>
```
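As the `val.dump` output shows, the trailing character is a non-breaking space (U+00A0), not an ASCII space, so `strip` and `gsub(" ", "")` never match it. A minimal sketch of two ways to remove it:

```ruby
# The scraped string ends in a non-breaking space (U+00A0), not " ".
val = "139\u00A0"

# Option 1: target the character explicitly.
cleaned = val.gsub(/\u00A0/, "")

# Option 2: remove all Unicode whitespace via the POSIX bracket class,
# which (unlike String#strip) also matches U+00A0.
cleaned2 = val.gsub(/[[:space:]]/, "")

puts cleaned.inspect # prints "139"
```

Option 2 is the more general fix when scraping, since pages may also contain other exotic whitespace such as thin spaces.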

Clicking link with JavaScript in Mechanize

China☆狼群 submitted on 2019-11-30 15:11:55
I have this:

```html
<a class="top_level_active" href="javascript:Submit('menu_home')">Account Summary</a>
```

I want to click that link, but I get an error when using link_to. I've tried:

```ruby
bot.click(page.link_with(:href => /menu_home/))
bot.click(page.link_with(:class => 'top_level_active'))
bot.click(page.link_with(:href => /Account Summary/))
```

The error I get is:

```
NoMethodError: undefined method `[]' for nil:NilClass
```

That's a JavaScript link. Mechanize will not be able to click it, since it does not evaluate JavaScript. Sorry! Try to find out what happens in your browser when you click that link. Does it
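Since Mechanize cannot execute the `javascript:` href, one workaround is to extract the argument passed to `Submit()` and replicate whatever request it triggers yourself. A sketch, assuming the link's href as shown in the question (what `Submit()` actually does is an assumption; inspect the page's JavaScript to confirm):

```ruby
# The href copied from the question's HTML; no network access needed here.
href = "javascript:Submit('menu_home')"

# Extract the argument that the page's Submit() function receives.
target = href[/Submit\('([^']+)'\)/, 1] #=> "menu_home"

# Hypothetical follow-up with Mechanize, assuming Submit('menu_home')
# merely submits a form identified by that name:
#   form = page.form_with(:name => target)
#   bot.submit(form)
```

If `Submit()` instead issues an XHR or builds a URL, watch the browser's network tab when clicking the link and reproduce that request with `agent.get` or `agent.post`.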

getaddrinfo error with Mechanize

三世轮回 submitted on 2019-11-30 15:02:47
I wrote a script that goes through all of the customers in our database, verifies that their website URL works, and tries to find a Twitter link on their homepage. We have a little over 10,000 URLs to verify. After a fraction of the URLs are verified, we start getting getaddrinfo errors for every URL. Here's a copy of the code that scrapes a single URL:

```ruby
def scrape_url(url)
  url_found = false
  twitter_name = nil
  begin
    agent = Mechanize.new do |a|
      a.follow_meta_refresh = true
    end
    agent.get(normalize_url(url)) do |page|
      url_found = true
      twitter_name = find_twitter_name(page)
    end
    @err << "[#{
```
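When getaddrinfo starts failing only after many successful lookups, a common cause is resource exhaustion in the process (e.g. file descriptors held by connections that are never closed), often made worse by building a fresh agent for every URL. Independently of that, transient DNS failures in a 10,000-URL batch are worth retrying rather than aborting on. A sketch of a generic retry helper (pure Ruby; `agent` and `normalize_url` in the usage comment are the poster's own objects):

```ruby
require 'socket' # defines SocketError, which a failed getaddrinfo lookup raises

# Run a block, retrying a few times on transient network errors
# with a linearly growing backoff.
def with_retries(max_attempts: 3, errors: [SocketError], base_delay: 0.5)
  attempts = 0
  begin
    yield
  rescue *errors
    attempts += 1
    raise if attempts >= max_attempts
    sleep(base_delay * attempts)
    retry
  end
end

# Hypothetical usage with a single shared Mechanize agent
# (reusing one agent avoids opening new connections per URL):
#   with_retries { agent.get(normalize_url(url)) }
```

If the errors persist even with retries, check `ulimit -n` and make sure the script reuses one agent across all 10,000 URLs instead of constructing one per call.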