I have a mystery to solve after upgrading our Rails 3.2 app from Ruby 1.9 to Ruby 2.1.2. Nokogiri seems to break: its behavior changes when it is fed by open-uri. No gem versions changed, only the Ruby version (this is all on OS X Mavericks, using brew, gcc4, etc.).
Steps to reproduce:
$ ruby -v
ruby 1.9.3p484 (2013-11-22 revision 43786) [x86_64-darwin13.1.0]
$ rails console
Connecting to database specified by database.yml
Loading development environment (Rails 3.2.18)
> feed = Nokogiri::XML(open(URI.encode("http://anyblog.wordpress.org/feed/")))
=> #(Document:0x3fcb82f08448 {
name = "document",
children = [
..
> feed.xpath("//item").count
=> 10
So all good! Next, after an rvm switch to Ruby 2.1.2 and a bundle install:
$ ruby -v
ruby 2.1.2p95 (2014-05-08 revision 45877) [x86_64-darwin13.0]
$ rails console
Connecting to database specified by database.yml
Loading development environment (Rails 3.2.18)
> feed = Nokogiri::XML(open(URI.encode("http://anyblog.wordpress.org/feed/")))
=>
> feed.inspect
=> "#<Nokogiri::XML::Document:0x86a1f21c name=\"document\">"
> feed.xpath("//item").count
=> 0
So it looks like the behavior of 'open' has changed, in that a gzipped HTTP stream isn't being decompressed before it is fed to Nokogiri? I checked with nokogiri -v and it is using the packaged XML libs rather than the system ones. Is this an open-uri issue in Ruby 2.1.2?
Another theory is that one of the gems has monkey patched open-uri to fix something in 1.9, and that patch is breaking 2.1. Help or ideas please!
EDIT: Here's more info without Nokogiri, since I now think this is more of an open-uri issue on Ruby 2.1.2:
> open(url) {|f|
* f.each_line {|line| p line}
* p f.content_type
* p f.charset
* p f.content_encoding
* }
"\u001F\x8B\b\u0000\u0000\u0000\u0000\u0000\u0000\u0003\xED\x9D\xDBr\eW\xB2\xA6\xAF\xED\xA7\xA8\xCD\u001E\xB7/$\u0010..
(snip)
3\xF3\xA79\xA7\xFAɗ\xFF\u000F\xEAo\x9C\u0014k\xE8\u0000\u0000"
"text/xml"
"utf-8"
["gzip"]
=> ["gzip"]
..whereas the 1.9 version was readable, i.e. the gzip decoding had already been applied.
If I go into a clean ruby irb it works OK, so it must be something in my Rails gems that is changing the behavior of open-uri's open so that it no longer inflates gzip/deflate responses. I have a lot of gems referenced.. :(
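As a stopgap while hunting for the offending gem, the body can be inflated by hand when content_encoding reports gzip but the bytes are still compressed. This is only a minimal sketch: inflate_if_gzipped is a hypothetical helper name, not part of open-uri, and it assumes the whole body fits in memory.

```ruby
require 'zlib'
require 'stringio'

# Hypothetical helper (not part of open-uri): returns the inflated body when
# the response is declared gzip-encoded AND still starts with the gzip magic
# bytes; otherwise returns the body unchanged.
def inflate_if_gzipped(body, content_encoding)
  declared_gzip = Array(content_encoding).map { |e| e.to_s.downcase }.include?('gzip')
  # A gzip stream always begins with the bytes 0x1F 0x8B.
  still_compressed = body.bytes.first(2) == [0x1F, 0x8B]
  return body unless declared_gzip && still_compressed

  Zlib::GzipReader.new(StringIO.new(body)).read
end

# Possible usage with open-uri, as in the session above:
#   xml  = open(url) { |f| inflate_if_gzipped(f.read, f.content_encoding) }
#   feed = Nokogiri::XML(xml)
```

Checking the magic bytes as well as the header means the helper does nothing on a setup where the body has already been decoded, so it should be safe to leave in place on both Ruby versions.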
Ok, here's an answer, and maybe the answer. Ruby 2 changed how Net::HTTP uses request headers and handles gzip/deflate decoding, but at some point the maintainers changed their minds and went back to how 1.9 worked. In the interim, some gem maintainers monkey patched Net::HTTP to make their gems work on both 1.9 and 2.0. Those monkey patches still linger in older versions of gems and cause issues like the one I saw upgrading from 1.9 to 2.1.
A summary of the issue and the solution is here:
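One way to spot such a lingering patch is to ask Ruby where the Net::HTTP methods were defined. The following is a rough sketch, assuming the patch redefines methods in Ruby code; method_locations is a hypothetical helper, and the method names checked are just plausible candidates, not a list taken from any particular gem.

```ruby
require 'net/http'

# Hypothetical helper: maps each method name to the file that currently
# defines it on klass. Names the class does not define are skipped.
def method_locations(klass, method_names)
  method_names.each_with_object({}) do |name, found|
    next unless klass.method_defined?(name) || klass.private_method_defined?(name)

    # source_location is [file, line] for Ruby-defined methods, nil for native ones.
    location = klass.instance_method(name).source_location
    found[name] = location ? location.first : nil
  end
end

# Run this inside the Rails console so every gem is loaded.
method_locations(Net::HTTP, [:request, :get, :transport_request]).each do |name, file|
  puts "#{name}: #{file || '(native code)'}"
end
```

A path under a gem directory, rather than under Ruby's own net/http.rb, suggests that gem has reopened Net::HTTP.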
http://avi.io/blog/2013/12/17/do-not-upgrade-your-rails-project-to-ruby-2-before-you-read-this/
We use the right_aws gem, and the details of that issue across Ruby versions are here:
https://github.com/sferik/twitter/issues/473
The solution was to undo the monkey patch using this as a gem reference in our Gemfile:
gem 'right_http_connection', git: 'git://github.com/rightscale/right_http_connection.git', ref: '3359524d81'
Background reading and more info:
Source: https://stackoverflow.com/questions/23664261/ruby-2-upgrade-breaks-nokogiri-and-or-open-uri-encoding