Question
Hi, I am trying to scrape a web page, take the links from it, then follow each link and scrape that page too.
require 'rubygems'
require 'scrapi'
require 'uri'

Scraper::Base.parser :html_parser

web = "http://......"

def sub_web(linksubweb)
  uri = URI.parse(URI.encode(linksubweb))
end

scraper = Scraper.define do
  array :items
  process "div.mozaique>div", :items => Scraper.define {
    process "p>a", :title => :text
    process "div.thumb>a", :link => "@href"
    result :title, :link
  }
  result :items
end

uri = URI.parse(URI.encode(web))
scraper.scrape(uri).each do |pag|
  link_full = uri + pag.link.to_str
  puts pag.title
  sub_web(link_full)
  puts
end
When I run it, I get the following error:
e $stdout.sync=true;$stderr.sync=true;load($0=ARGV.shift) /Users/sss/web/app/views/admin/topics/webconector.rb
Title 1
http://www.mydomain.com/user34/top5
/Users/sss/.rvm/rubies/ruby-1.9.3-p448/lib/ruby/1.9.1/uri/common.rb:304:in `escape': undefined method `gsub' for #<URI::HTTP:0x007fa07cb01e08> (NoMethodError)
from /Users/sss/.rvm/rubies/ruby-1.9.3-p448/lib/ruby/1.9.1/uri/common.rb:623:in `escape'
from ../app/views/admin/topics/conectaweb.rb:11:in `sub_web'
from ../app/views/admin/topics/conectaweb.rb:34:in `block in <top (required)>'
from ../views/admin/topics/conectaweb.rb:29:in `each'
from ../app/views/admin/topics/conectaweb.rb:29:in `<top (required)>'
from -e:1:in `load'
from -e:1:in `<main>'
Process finished with exit code 1
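Answer 1:
Try using uri = URI.parse(URI.encode(linksubweb.to_s)); this should work.
The problem is that URI.encode requires a String argument (it calls gsub on it, as the stack trace shows), but link_full is a URI::HTTP object, because uri + pag.link.to_str returns a URI rather than a String. You therefore have to convert the URI::HTTP object into a string first.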
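As a rough sketch of the same idea: URI.encode was removed in Ruby 3.0, so the version below avoids it. Instead, sub_web accepts either a String or a URI and only parses when it has to. The base URL and relative link are illustrative values based on the output in the question, not the asker's real data, and this variant of sub_web is an assumption about what the helper is meant to do.

```ruby
require 'uri'

# Illustrative values, standing in for the elided URL in the question.
base = URI.parse("http://www.mydomain.com/")
link_full = base + "/user34/top5" # URI#+ merges and returns a URI object, not a String

# Hypothetical variant of sub_web: returns a URI whether it is given a String or a URI.
def sub_web(linksubweb)
  # A URI object has no gsub, so never hand it to String-only helpers;
  # reuse it as-is, and parse only when the input is not already a URI.
  linksubweb.is_a?(URI::Generic) ? linksubweb : URI.parse(linksubweb.to_s)
end

uri = sub_web(link_full)
puts uri.class
puts uri
```

On an older Ruby where URI.encode still exists, calling linksubweb.to_s before encoding, as suggested in the answer, has the same effect of passing a String rather than a URI object.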
Source: https://stackoverflow.com/questions/18462667/in-escape-undefined-method-gsub-for-urihttp0x007fa07cb01e08-nomethod