How does one reliably determine a file's type? File extension analysis is not acceptable. There must be a rubyesque tool similar to the UNIX file(1) command?
There is a Ruby binding to libmagic that does what you need. It is available as a gem named ruby-filemagic:
gem install ruby-filemagic
This requires libmagic-dev to be installed.
The documentation seems a little thin, but this should get you started:
$ irb
irb(main):001:0> require 'filemagic'
=> true
irb(main):002:0> fm = FileMagic.new
=> #<FileMagic:0x7fd4afb0>
irb(main):003:0> fm.file('foo.zip')
=> "Zip archive data, at least v2.0 to extract"
irb(main):004:0>
The mime-types gem for Ruby works well.
You can use this reliable method based on the magic header of the file:
require 'shellwords'

def get_image_extension(local_file_path)
  # With a length argument, IO.read returns a binary string (nil for an empty file).
  header = IO.read(local_file_path, 10).to_s
  if header.start_with?('GIF8')
    'gif'
  elsif header.start_with?("\x89PNG".b)
    'png'
  elsif header.start_with?("\xFF\xD8\xFF\xE0\x00\x10JFIF".b)
    'jpg'
  elsif header.start_with?("\xFF\xD8\xFF\xE1".b) && header[6, 4] == 'Exif'
    'jpg' # Exif JPEG: marker, two length bytes, then "Exif"
  else
    # Fall back to file(1); works on Linux and Mac. The path is escaped for the shell.
    mime_type = `file --brief --mime-type #{Shellwords.escape(local_file_path)}`.strip
    # UnprocessableEntity is assumed to be defined elsewhere in the application.
    raise UnprocessableEntity, 'unknown file type' if mime_type.empty?
    mime_type.split('/')[1].gsub('x-', '').gsub(/jpeg/, 'jpg').gsub(/text/, 'txt')
  end
end
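If you need more than images, the same magic-header idea can be driven from a small lookup table instead of a chain of branches. The following is only a sketch: the SIGNATURES table and the sniff_type name are made up for illustration, and the table lists just a handful of well-known signatures, so treat it as a starting point rather than a complete detector.

```ruby
# Hypothetical signature table: each type maps to [offset, bytes] pairs
# that must all match. Only a few well-known formats are listed.
SIGNATURES = {
  gif: [[0, 'GIF8'.b]],
  png: [[0, "\x89PNG\r\n\x1A\n".b]],
  jpg: [[0, "\xFF\xD8\xFF".b]],
  pdf: [[0, '%PDF-'.b]],
  zip: [[0, "PK\x03\x04".b]]
}.freeze

# Returns the matching type as a symbol, or nil if no signature matches.
def sniff_type(path)
  # Read only the first bytes, in binary mode; to_s covers an empty file.
  header = File.binread(path, 16).to_s
  match = SIGNATURES.find do |_type, parts|
    parts.all? { |offset, bytes| header[offset, bytes.bytesize] == bytes }
  end
  match && match.first
end
```

Calling sniff_type on a path returns a symbol such as :png when the header matches, and nil otherwise, at which point you could still fall back to file(1) as in the method above.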
Pure Ruby solution using magic bytes and returning a symbol for the matching type:
https://github.com/SixArm/sixarm_ruby_magic_number_type
I wrote it, so if you have suggestions, let me know.
If you're using the File class, you can augment it with the following functions based on @PatrickRichie's answer:
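require 'shellwords'

class File
  # MIME type reported by file(1), e.g. "text/plain".
  def mime_type
    `file --brief --mime-type #{Shellwords.escape(path)}`.strip
  end

  # Character set reported by file(1), e.g. "us-ascii".
  # Uses [1] rather than .second so it also works outside Rails.
  def charset
    `file --brief --mime #{Shellwords.escape(path)}`.split(';')[1].split('=')[1].strip
  end
end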
And, if you're using Ruby on Rails, you can drop this into config/initializers/file.rb and have it available throughout your project.
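If you would rather not go through the shell at all, one option is to call file(1) via Open3 with an argument list, so the path is never parsed by a shell. This is a rough sketch, not a drop-in replacement: parse_mime_output and mime_info are hypothetical names, and it assumes file(1) is on the PATH and prints output of the form "type/subtype; charset=name".

```ruby
require 'open3'

# Splits output such as "text/plain; charset=us-ascii" into its two parts.
# Returns [mime_type, charset]; charset is nil when none is present.
def parse_mime_output(output)
  mime_type, params = output.strip.split(';', 2)
  charset = params.to_s[/charset=([^;\s]+)/, 1]
  [mime_type, charset]
end

# Runs file(1) with an argument list, so no shell ever sees the path.
# The "--" keeps a path starting with "-" from being read as an option.
def mime_info(path)
  output, status = Open3.capture2('file', '--brief', '--mime', '--', path)
  raise "file(1) failed for #{path}" unless status.success?
  parse_mime_output(output)
end
```

Keeping the parsing separate from the command call means the string handling can be checked without file(1) being installed.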
You could give shared-mime a try (gem install shared-mime-info). It requires the Freedesktop shared-mime-info library, but it does both filename/extension checks and "magic" checks. I tried giving it a whirl just now, but I don't have the Freedesktop shared-mime-info database installed and have to do "real work," unfortunately. Still, it might be what you're looking for.