问题
How do I encode or 'escape' the URL before I use OpenURI to open(url)
?
We're using OpenURI to open a remote url and return the xml:
getresult = open(url).read
The problem is the URL contains some user-input text that contains spaces and other characters, including "+", "&", "?", etc. potentially, so we need to safely escape the URL. I saw lots of examples when using Net::HTTP, but have not found any for OpenURI.
We also need to be able to un-escape a similar string we receive in a session variable, so we need the reciprocal function.
回答1:
Ruby has the built-in URI library, and the Addressable gem, in particular Addressable::URI
I prefer Addressable::URI. It's very full featured and handles the encoding for you when you use the query_values=
method.
I've seen some discussions about URI going through some growing pains so I tend to leave it alone for handling encoding/escaping until these things get sorted out:
- http://osdir.com/ml/ruby-core/2010-06/msg00324.html
- http://osdir.com/ml/lang-ruby-core/2009-06/msg00350.html
- http://osdir.com/ml/ruby-core/2011-06/msg00748.html
回答2:
Don't use URI.escape
as it has been deprecated in 1.9.
Rails' Active Support adds Hash#to_query:
{foo: 'asd asdf', bar: '"<#$dfs'}.to_query
# => "bar=%22%3C%23%24dfs&foo=asd+asdf"
Also, as you can see it tries to order query parameters always the same way, which is good for HTTP caching.
回答3:
Ruby Standard Library to the rescue:
require 'uri'
user_text = URI.escape(user_text)
url = "http://example.com/#{user_text}"
result = open(url).read
See more at the docs for the URI::Escape module. It also has a method to do the inverse (unescape
)
回答4:
The main thing you have to consider is that you have to escape the keys and values separately before you compose the full URL.
All the methods which get the full URL and try to escape it afterwards are broken, because they cannot tell whether any &
or =
character was supposed to be a separator, or maybe a part of the value (or part of the key).
The CGI library seems to do a good job, except for the space character, which was traditionally encoded as +
, and nowadays should be encoded as %20
. But this is an easy fix.
Please, consider the following:
require 'cgi'
def encode_component(s)
# The space-encoding is a problem:
CGI.escape(s).gsub('+','%20')
end
def url_with_params(path, args = {})
return path if args.empty?
path + "?" + args.map do |k,v|
"#{encode_component(k.to_s)}=#{encode_component(v.to_s)}"
end.join("&")
end
def params_from_url(url)
path,query = url.split('?',2)
return [path,{}] unless query
q = query.split('&').inject({}) do |memo,p|
k,v = p.split('=',2)
memo[CGI.unescape(k)] = CGI.unescape(v)
memo
end
return [path, q]
end
u = url_with_params( "http://example.com",
"x[1]" => "& ?=/",
"2+2=4" => "true" )
# "http://example.com?x%5B1%5D=%26%20%3F%3D%2F&2%2B2%3D4=true"
params_from_url(u)
# ["http://example.com", {"x[1]"=>"& ?=/", "2+2=4"=>"true"}]
来源:https://stackoverflow.com/questions/4967608/in-ruby-rails-how-can-i-encode-escape-special-characters-in-urls