Ruby - net/http - following redirects

北恋 2020-12-01 03:05

I've got a URL and I'm using HTTP GET to pass a query along to a page. What happens with the most recent flavor (in net/http) is that the script doesn't go

6 Answers
  • 2020-12-01 03:19

    Given a URL that redirects

    url = 'http://httpbin.org/redirect-to?url=http%3A%2F%2Fhttpbin.org%2Fredirect-to%3Furl%3Dhttp%3A%2F%2Fexample.org'
    

    A. Net::HTTP

    require 'net/http'
    
    # Keep requesting until the response is no longer a redirect.
    begin
      response = Net::HTTP.get_response(URI.parse(url))
      url = response['location']
    end while response.is_a?(Net::HTTPRedirection)
    

    Make sure that you handle the case when there are too many redirects.
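
    One way to enforce such a limit is sketched below. This is only an illustration and not part of the original answer: the method name `get_following_redirects`, the `TooManyRedirects` error class, and the `max_redirects` and `fetcher` parameters are all made up. The `fetcher` argument defaults to `Net::HTTP.get_response` and exists mainly so the loop can be exercised without network access.

```ruby
require 'net/http'
require 'uri'

# Hypothetical error class, not part of net/http.
class TooManyRedirects < StandardError; end

# Follows up to max_redirects redirects and returns the final response.
# fetcher is any callable that takes a URI and returns a Net::HTTPResponse.
def get_following_redirects(url, max_redirects: 5, fetcher: Net::HTTP.method(:get_response))
  uri = URI.parse(url)
  redirects = 0

  loop do
    response = fetcher.call(uri)
    location = response['location']
    # 304 Not Modified is also a Net::HTTPRedirection but has no Location,
    # so a missing Location ends the loop as well.
    return response unless response.is_a?(Net::HTTPRedirection) && location

    redirects += 1
    if redirects > max_redirects
      raise TooManyRedirects, "gave up after #{max_redirects} redirects"
    end

    # URI#+ resolves a relative Location against the current URI.
    uri += location
  end
end
```

    Called as `get_following_redirects(url, max_redirects: 3)`, this should either return the first non-redirect response or raise `TooManyRedirects` once a fourth redirect shows up.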

    B. OpenURI

    require 'open-uri'
    URI.open(url).read  # Kernel#open no longer opens URLs as of Ruby 3.0
    

    OpenURI::OpenRead#open follows redirects by default, but it doesn't limit the number of redirects.
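
    If a cap is needed, one option is to switch off automatic redirects with open-uri's `redirect: false` option and count the hops by hand, since open-uri then raises `OpenURI::HTTPRedirect` with the target in `#uri`. The sketch below assumes that behaviour; the helper name `open_with_redirect_limit` and the `max_redirects` and `opener` parameters are hypothetical, and `opener` is only there so the logic can be tried without a network.

```ruby
require 'open-uri'

# Hypothetical helper: follows at most max_redirects redirects with open-uri.
# opener defaults to URI.open and is injectable for testing.
def open_with_redirect_limit(url, max_redirects: 5, opener: URI.method(:open))
  uri = URI.parse(url)
  redirects = 0
  begin
    # With redirect: false, open-uri raises OpenURI::HTTPRedirect
    # instead of following the redirect itself.
    opener.call(uri.to_s, redirect: false)
  rescue OpenURI::HTTPRedirect => e
    redirects += 1
    raise if redirects > max_redirects # re-raise the redirect error
    uri = e.uri # redirect target reported by open-uri
    retry
  end
end
```

    `open_with_redirect_limit(url).read` would then behave like the one-liner above, except that it gives up after `max_redirects` hops.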

  • 2020-12-01 03:20

    Maybe you can use the curb-fu gem (https://github.com/gdi/curb-fu). The only catch is that it needs some extra code to make it follow redirects. I've used the following before. Hope it helps.

    require 'rubygems'
    require 'curb-fu'
    
    module CurbFu
      class Request
        module Base
          def new_meth(url_params, query_params = {})
            curb = old_meth url_params, query_params
            curb.follow_location = true
            curb
          end
    
          alias :old_meth :build
          alias :build :new_meth
        end
      end
    end
    
    # This should follow the redirect, because the patched build method
    # sets follow_location = true on the curb object it returns.
    print CurbFu.get('http://<your path>/').body
    
  • 2020-12-01 03:21

    To follow redirects, you can do something like this (taken from ruby-doc)

    Following Redirection

    require 'net/http'
    require 'uri'
    
    def fetch(uri_str, limit = 10)
      # You should choose better exception.
      raise ArgumentError, 'HTTP redirect too deep' if limit == 0
    
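      url = URI.parse(uri_str)
      # request_uri keeps the query string and returns "/" for an empty path
      req = Net::HTTP::Get.new(url.request_uri, { 'User-Agent' => 'Mozilla/5.0 (etc...)' })
      # Enable TLS only for https URLs; forcing it on plain http URLs fails
      response = Net::HTTP.start(url.host, url.port, use_ssl: url.scheme == 'https') { |http| http.request(req) }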
      case response
      when Net::HTTPSuccess     then response
      when Net::HTTPRedirection then fetch(response['location'], limit - 1)
      else
        response.error!
      end
    end
    
    print fetch('http://www.ruby-lang.org/')
    
  • 2020-12-01 03:31

    The reference that worked for me is here: http://shadow-file.blogspot.co.uk/2009/03/handling-http-redirection-in-ruby.html

    Compared to most examples (including the accepted answer here), it's more robust as it handles URLs which are just a domain (http://example.com - needs to add a /), handles SSL specifically, and also relative URLs.
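
    For reference, those three cases can each be covered with plain `uri` and `net/http` calls. The following is a rough sketch rather than the code from the linked post; the helper names `request_path`, `ssl?`, `next_uri` and `fetch_once` are made up for illustration.

```ruby
require 'net/http'
require 'uri'

# Path for the request line: "/" for a bare domain, query string kept.
def request_path(uri)
  uri.request_uri
end

# Whether the connection should use TLS.
def ssl?(uri)
  uri.scheme == 'https'
end

# Resolves a Location value (absolute or relative) against the current URI.
def next_uri(current, location)
  URI.join(current.to_s, location)
end

# Performs a single GET without following redirects.
def fetch_once(uri)
  Net::HTTP.start(uri.host, uri.port, use_ssl: ssl?(uri)) do |http|
    http.request(Net::HTTP::Get.new(request_path(uri)))
  end
end
```

    For example, `request_path(URI('http://example.com'))` gives `"/"`, and `next_uri(URI('http://example.com/a/b'), '../c')` resolves to `http://example.com/c`.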

    Of course you would be better off using a library like RESTClient in most cases, but sometimes the low-level detail is necessary.

  • 2020-12-01 03:36

    I wrote another class for this based on examples given here, thank you very much everybody. I added cookies, parameters and exceptions and finally got what I need: https://gist.github.com/sekrett/7dd4177d6c87cf8265cd

    require 'uri'
    require 'net/http'
    require 'openssl'
    
    class UrlResolver
      def self.resolve(uri_str, agent = 'curl/7.43.0', max_attempts = 10, timeout = 10)
        attempts = 0
        cookie = nil
    
        until attempts >= max_attempts
          attempts += 1
    
          url = URI.parse(uri_str)
          http = Net::HTTP.new(url.host, url.port)
          http.open_timeout = timeout
          http.read_timeout = timeout
          path = url.path
          path = '/' if path == ''
          path += '?' + url.query unless url.query.nil?
    
          params = { 'User-Agent' => agent, 'Accept' => '*/*' }
          params['Cookie'] = cookie unless cookie.nil?
          request = Net::HTTP::Get.new(path, params)
    
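          if url.instance_of?(URI::HTTPS)
            http.use_ssl = true
            # NOTE: VERIFY_NONE skips certificate verification;
            # prefer OpenSSL::SSL::VERIFY_PEER outside of quick experiments.
            http.verify_mode = OpenSSL::SSL::VERIFY_NONE
          end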
          response = http.request(request)
    
          case response
            when Net::HTTPSuccess then
              break
            when Net::HTTPRedirection then
              location = response['Location']
              cookie = response['Set-Cookie']
              new_uri = URI.parse(location)
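              uri_str = if new_uri.relative?
                          # URI#+ returns a URI object; convert it back to a
                          # string because URI.parse only accepts strings
                          (url + location).to_s
                        else
                          new_uri.to_s
                        end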
            else
              raise 'Unexpected response: ' + response.inspect
          end
    
        end
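        # Checking the response (not the counter) avoids a false error
        # when the last allowed attempt succeeds.
        raise 'Too many http redirects' unless response.is_a?(Net::HTTPSuccess)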
    
        uri_str
        # response.body
      end
    end
    
    puts UrlResolver.resolve('http://www.ruby-lang.org')
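
    One caveat about the cookie handling: `response['Set-Cookie']` returns the raw header text, so attributes such as `Path` or `Expires` are sent back as-is, and several `Set-Cookie` headers get joined with commas. A possible refinement is sketched below; the helper name `cookie_header_from` is hypothetical, and the sketch deliberately ignores domain, path and expiry rules.

```ruby
require 'net/http'

# Hypothetical helper: builds a Cookie request header value from the
# Set-Cookie headers of a response, keeping only the name=value pairs.
# Returns nil when the response sets no cookies.
def cookie_header_from(response)
  set_cookies = response.get_fields('Set-Cookie') || []
  pairs = set_cookies.map { |field| field.split(';', 2).first.strip }
  pairs.empty? ? nil : pairs.join('; ')
end
```

    Inside the redirect branch this could replace the direct assignment, for example `cookie = cookie_header_from(response)`.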
    
  • 2020-12-01 03:38

    If you do not need to care about the details of each redirect, you can use the Mechanize library:

    require 'mechanize'
    
    agent = Mechanize.new
    begin
      response = agent.get(url)
    rescue Mechanize::ResponseCodeError
      # response codes other than 200, 301, or 302
    rescue Timeout::Error
      # the request timed out
    rescue Mechanize::RedirectLimitReachedError
      # too many redirects
    rescue StandardError
      # anything else
    end
    

    This returns the destination page. To turn off redirect following instead:

    agent.redirect_ok = false
    

    You can also change settings that apply to the request, such as the user agent:

    agent.user_agent = "Mozilla/5.0 (Linux; Android 6.0; Nexus 5 Build/MRA58N) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/80.0.3987.106 Mobile Safari/537.36"
    