How to set the Referer header before loading a page with Ruby mechanize?

Submitted by 孤人 on 2019-12-10 10:13:04

Question


Is there a straightforward way to set custom headers with Mechanize 2.3?

I tried a solution from an earlier answer, but I get this error:

$agent = Mechanize.new
$agent.pre_connect_hooks << lambda { |p|
  p[:request]['Referer'] = 'https://wwws.mysite.com/cgi-bin/apps/Main'
} 

# ./mech.rb:30:in `<main>': undefined method `pre_connect_hooks' for nil:NilClass (NoMethodError)

Answer 1:


The docs say:

get(uri, parameters = [], referer = nil, headers = {}) { |page| ... }

So, for example:

agent.get 'http://www.google.com/', [], agent.page.uri, {'foo' => 'bar'}

Alternatively, you might prefer:

agent.request_headers = {'foo' => 'bar'}
agent.get url



Answer 2:


You misread the code you were copying. The example contained a newline, but it was lost in the formatting because the snippet wasn't tagged as code. $agent is nil because you're using it before it has been initialized; initialize the object first, then use it. Try this:

$agent = Mechanize.new
$agent.pre_connect_hooks << lambda { |p| p[:request]['Referer'] = 'https://wwws.mysite.com/cgi-bin/apps/Main' }
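
To make it clearer what such a hook does to the outgoing request, here is a minimal sketch that uses only Ruby's standard library and never touches the network. It assumes that Mechanize 2.x calls each pre-connect hook with two arguments, the agent and the Net::HTTP request, rather than with a single hash as in the older snippet; check the documentation for your version. The local pre_connect_hooks array and the /cgi-bin/apps/Next path are hypothetical stand-ins for illustration only.

```ruby
require 'net/http'

# Hypothetical stand-in for the agent's hook list (Mechanize keeps its own).
pre_connect_hooks = []

# Assumption: hooks are called with (agent, request), as in Mechanize 2.x.
pre_connect_hooks << lambda { |_agent, request|
  request['Referer'] = 'https://wwws.mysite.com/cgi-bin/apps/Main'
}

# Build a GET request object without sending it (the path is made up).
request = Net::HTTP::Get.new('/cgi-bin/apps/Next')

# Run every hook over the request, passing nil in place of a real agent.
pre_connect_hooks.each { |hook| hook.call(nil, request) }

puts request['Referer']
```

If that assumption holds for your Mechanize version, the same two-argument lambda could be appended to $agent.pre_connect_hooks in place of the one-argument form.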



Answer 3:


For this question I noticed people seem to use:

page = agent.get("http://www.you.com/index_login/", :referer => "http://www.you.com/")

As an aside, now that I have tested this answer, it turns out the Referer was not the cause of my actual problem: every visit to the site I'm scraping requires going through the login pages again, even seconds after the first logged-in visit, even though I always load and save the complete cookie jar in YAML format. But that would be a separate question, of course.



Source: https://stackoverflow.com/questions/10124224/how-to-set-the-referer-header-before-loading-a-page-with-ruby-mechanize
