How do you download files, specifically .zip and .tar.gz, with Ruby and write them to the disk?
—
This question was originally specific to a bug in MacRuby, b
I've successfully downloaded and extracted GZip files with this code:
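require 'open-uri'
require 'zlib'
# 'wb' writes the decompressed tar bytes without newline translation.
open('tarball.tar', 'wb') do |local_file|
  # URI.open is needed on Ruby 3.0+, where Kernel#open no longer accepts URLs.
  URI.open('http://github.com/jashkenas/coffee-script/tarball/master/tarball.tar.gz') do |remote_file|
    local_file.write(Zlib::GzipReader.new(remote_file).read)
  end
end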
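The snippet above reads the whole archive into memory before writing it. As a possible alternative, here is a minimal sketch that streams the decompressed data to disk in chunks. The helper name gunzip_to_file is made up for illustration; it should accept any IO-like object, such as the one open-uri yields.

```ruby
require 'zlib'

# Hypothetical helper (not part of any library): decompress the gzip stream
# read from `io` and write the result to the file at `path`.
# Returns the number of decompressed bytes written.
def gunzip_to_file(io, path)
  gz = Zlib::GzipReader.new(io)
  File.open(path, 'wb') do |out|
    # copy_stream reads from the GzipReader in chunks instead of all at once.
    IO.copy_stream(gz, out)
  end
ensure
  # finish releases the zlib state but leaves the caller's io open.
  gz&.finish
end
```

With open-uri this could be called as URI.open(url) { |remote| gunzip_to_file(remote, 'tarball.tar') }, where url is whatever tarball address you are fetching.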
When downloading a .tar.gz with open-uri via a simple open() call, I was also getting errors uncompressing the file on disk. I eventually noticed that the file size was much larger than expected.
Inspecting download.tar.gz on disk showed that it actually contained the uncompressed download.tar, which could be untarred. This seems to be due to an implicit Accept-encoding: gzip header on the open() call, which makes sense for web content but is not what I wanted when retrieving a gzipped tarball. I was able to work around it and defeat that behavior by sending a blank Accept-encoding header in the optional hash argument to the remote open():
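require 'open-uri'
open('/local/path/to/download.tar.gz', 'wb') do |file|
  # Send a blank Accept-encoding header to avoid transparent gzip decoding.
  # URI.open is needed on Ruby 3.0+, where Kernel#open no longer accepts URLs.
  file.write URI.open('https://example.com/remote.tar.gz', {'Accept-encoding'=>''}).read
end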
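To confirm whether a downloaded file really is still gzipped (rather than going by its size), one option is to look at its first two bytes, which are 0x1f 0x8b for gzip data. This is only a sketch; gzip_file? and GZIP_MAGIC are names invented here.

```ruby
# The two magic bytes that begin every gzip stream.
GZIP_MAGIC = "\x1f\x8b".b

# Hypothetical helper: true when the file at `path` starts with the gzip
# magic bytes, false otherwise (including for an empty file).
def gzip_file?(path)
  File.open(path, 'rb') { |f| f.read(2) } == GZIP_MAGIC
end
```

If gzip_file?('/local/path/to/download.tar.gz') returns false, the file was most likely decompressed in transit and is a plain tar archive despite its name.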
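I'd recommend using open-uri from Ruby's stdlib.
require 'open-uri'
# 'wb' keeps binary files such as .zip and .tar.gz intact.
open(out_file, 'wb') do |out|
  # URI.open is needed on Ruby 3.0+, where Kernel#open no longer accepts URLs.
  out.write(URI.open(url).read)
end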
http://ruby-doc.org/stdlib/libdoc/open-uri/rdoc/classes/OpenURI/OpenRead.html#M000832
Make sure you look at the :progress_proc option to open as it looks like you want a progress hook.
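A rough sketch of such a progress hook follows. It pairs :progress_proc with open-uri's :content_length_proc option so a percentage can be shown when the server reports a length. The method names download_with_progress and format_progress are invented for this example.

```ruby
require 'open-uri'

# Hypothetical helper: build a progress line. `total` may be nil when the
# server does not send a Content-Length header.
def format_progress(done, total)
  return "#{done} bytes" if total.nil? || total.zero?

  format('%d/%d bytes (%.1f%%)', done, total, done * 100.0 / total)
end

# Hypothetical helper: download `url` to `dest`, printing progress to stderr.
def download_with_progress(url, dest)
  total = nil
  URI.open(
    url,
    # Called once with the expected size (or nil if unknown).
    content_length_proc: ->(length) { total = length },
    # Called repeatedly with the number of bytes received so far.
    progress_proc: ->(done) { $stderr.print "\r#{format_progress(done, total)}" }
  ) do |remote|
    File.open(dest, 'wb') { |out| IO.copy_stream(remote, out) }
  end
  $stderr.puts
  dest
end
```

Note that open-uri buffers the whole response before yielding it, so the progress callbacks fire during the download and the copy to dest happens afterwards.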
The last time I got corrupted files with Ruby was when I forgot to call file.binmode right after File.open. It took me hours to find out what was wrong. Does it help with your issue?
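For illustration, this is roughly what that looks like; write_binary is a made-up helper name. Opening the file with mode 'wb' should have the same effect as calling binmode right after opening it.

```ruby
# Hypothetical helper: write `data` to `path` byte for byte, with no newline
# or encoding translation.
def write_binary(path, data)
  File.open(path, 'w') do |file|
    file.binmode # switch to binary mode before the first write
    file.write(data)
  end
end
```

The difference matters most on Windows, where text mode translates line endings and can corrupt archives.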