How to iterate through an in-memory zip file in Ruby

Submitted by 蓝咒 on 2019-12-08 15:24:47

Question


I am writing unit tests, and one of them returns a zip file. I want to check the contents of this zip file, grab some values from it, and pass those values to the next tests.

I'm using Rack Test, so I know the content of my zip file is inside last_response.body. I have looked through the RubyZip documentation, but it always seems to expect a file. Since I'm running a unit test, I would prefer to do everything in memory, if possible, so as not to pollute any folder with test zip files.


Answer 1:


See @bronson’s answer for a more up to date version of this answer using the newer RubyZip API.

The Rubyzip docs you linked to look a bit old. The latest release (0.9.9) can handle IO objects, so you can use a StringIO (with a little tweaking).

Even though the API will accept an IO, it still seems to assume it's a file and tries to call path on it, so first monkey patch StringIO to add a path method (it doesn't need to actually do anything):

require 'stringio'
class StringIO
  def path
  end
end

Then you can do something like:

require 'zip/zip'
Zip::ZipInputStream.open_buffer(StringIO.new(last_response.body)) do |io|
  while (entry = io.get_next_entry)
    # deal with your zip contents here, e.g.
    puts "Contents of #{entry.name}: '#{io.read}'"
  end
end

and everything will be done in memory.




Answer 2:


Matt's answer is exactly right. Here it is updated to the new API:

Zip::InputStream.open(StringIO.new(input)) do |io|
  while entry = io.get_next_entry
    if entry.name == 'doc.kml'
      parse_kml(io.read)
    else
      raise "unknown entry in kmz file: #{entry.name}"
    end
  end
end

And there's no need to monkeypatch StringIO anymore. Progress!




Answer 3:


require 'zip'

# The accumulator must exist before += is used on it
decompressed_data = ''
Zip::File.open_buffer(content) do |zip|
  zip.each do |entry|
    decompressed_data += entry.get_input_stream.read
  end
end



Answer 4:


With RubyZip version 1.2.1 (and possibly some earlier versions), we just need to use the open_buffer method of the Zip::File class.

From RubyZip documentation:

Like #open, but reads zip archive contents from a String or open IO stream, and outputs data to a buffer. (This can be used to extract data from a downloaded zip archive without first saving it to disk.)

Example:

Zip::File.open_buffer(last_response.body) do |zip|
  zip.each do |entry|
    puts entry.name
    # Do whatever you want with the content files.
  end
end



Answer 5:


This worked for me. In my case I have only one file so I used a fixed path, but you can use entry.name to build your path.

input = HTTParty.get(link).body
Zip::File.open_buffer(input) do |zip_file|
  zip_file.each do |entry|
    entry.extract(path)
  end
end



Answer 6:


You could use Tempfile to dump the zip file into a temporary file. Tempfile creates the file in the operating system's temporary directory, and Ruby deletes it automatically when the Tempfile object is garbage-collected or the interpreter exits.
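
A minimal sketch of the temporary-file route, using only the standard library. It uses Tempfile.create with a block, which removes the file as soon as the block exits. The 22-byte string is an empty zip archive used as a placeholder for last_response.body, and the variable names are illustrative.

```ruby
require 'tempfile'

# Placeholder for last_response.body: an empty zip archive
# (just the 22-byte end-of-central-directory record).
zip_bytes = "PK\x05\x06".b + ("\x00".b * 18)

path_seen = nil
bytes_on_disk = nil

Tempfile.create(['response', '.zip']) do |file|
  file.binmode          # zip data is binary
  file.write(zip_bytes)
  file.flush            # make sure the bytes are on disk before reading by path
  path_seen = file.path
  # A path-based API (for example Zip::File.open(file.path)) could be called here.
  bytes_on_disk = File.binread(file.path)
end

puts "wrote #{bytes_on_disk.bytesize} bytes to #{path_seen}"
```

Because the file is gone once the block returns, any values needed by later tests have to be copied out inside the block.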




Answer 7:


Just an update on this one, due to changes in rubyzip:

Zip::InputStream.open(StringIO.new(zip_file)) do |io|
  while (entry = io.get_next_entry)
    # deal with your zip contents here, e.g.
    puts "Contents of #{entry.name}: '#{io.read}'"
  end
end



Answer 8:


Inspired by Matt's answer, I have a slightly modified solution for those who have to use the 0.9.x rubyzip gem. Mine doesn't require reopening the StringIO class.

sio = StringIO.new(response.body)
# Define a no-op path method on this one object to satisfy the ancient rubyzip 0.9.8 gem
sio.define_singleton_method(:path) {}
Zip::ZipInputStream.open_buffer(sio) do |io|
  while (entry = io.get_next_entry)
    puts "Contents of #{entry.name}"
  end
end


Source: https://stackoverflow.com/questions/13730720/how-to-iterate-through-an-in-memory-zip-file-in-ruby