Let\'s say you want to save a bunch of files somewhere, for instance in BLOBs. Let\'s say you want to dish these files out via a web page and have the client automatically o
I 've tried a lot of examples but with Django mutagen plays nicely.
Example checking if files is mp3
from mutagen.mp3 import MP3, HeaderNotFoundError
try:
audio = MP3(file)
except HeaderNotFoundError:
raise ValidationError('This file should be mp3')
The downside is that your ability to check file types is limited, but it's a great way if you want not only check for file type but also to access additional information.
This seems to be very easy
>>> from mimetypes import MimeTypes
>>> import urllib
>>> mime = MimeTypes()
>>> url = urllib.pathname2url('Upload.xml')
>>> mime_type = mime.guess_type(url)
>>> print mime_type
('application/xml', None)
Please refer Old Post
Update - In python 3+ version, it's more convenient now:
import mimetypes
print(mimetypes.guess_type("sample.html"))
I'm surprised that nobody has mentioned it but Pygments is able to make an educated guess about the mime-type of, particularly, text documents.
Pygments is actually a Python syntax highlighting library but is has a method that will make an educated guess about which of 500 supported document types your document is. i.e. c++ vs C# vs Python vs etc
import inspect
def _test(text: str):
from pygments.lexers import guess_lexer
lexer = guess_lexer(text)
mimetype = lexer.mimetypes[0] if lexer.mimetypes else None
print(mimetype)
if __name__ == "__main__":
# Set the text to the actual defintion of _test(...) above
text = inspect.getsource(_test)
print('Text:')
print(text)
print()
print('Result:')
_test(text)
Output:
Text:
def _test(text: str):
from pygments.lexers import guess_lexer
lexer = guess_lexer(text)
mimetype = lexer.mimetypes[0] if lexer.mimetypes else None
print(mimetype)
Result:
text/x-python
Now, it's not perfect, but if you need to be able to tell which of 500 document formats are being used, this is pretty darn useful.
In Python 3.x and webapp with url to the file which couldn't have an extension or a fake extension. You should install python-magic, using
pip3 install python-magic
For Mac OS X, you should also install libmagic using
brew install libmagic
Code snippet
import urllib
import magic
from urllib.request import urlopen
url = "http://...url to the file ..."
request = urllib.request.Request(url)
response = urlopen(request)
mime_type = magic.from_buffer(response.readline())
print(mime_type)
alternatively you could put a size into the read
import urllib
import magic
from urllib.request import urlopen
url = "http://...url to the file ..."
request = urllib.request.Request(url)
response = urlopen(request)
mime_type = magic.from_buffer(response.read(128))
print(mime_type)
2017 Update
No need to go to github, it is on PyPi under a different name:
pip3 install --user python-magic
# or:
sudo apt install python3-magic # Ubuntu distro package
The code can be simplified as well:
>>> import magic
>>> magic.from_file('/tmp/img_3304.jpg', mime=True)
'image/jpeg'
@toivotuo 's method worked best and most reliably for me under python3. My goal was to identify gzipped files which do not have a reliable .gz extension. I installed python3-magic.
import magic
filename = "./datasets/test"
def file_mime_type(filename):
m = magic.open(magic.MAGIC_MIME)
m.load()
return(m.file(filename))
print(file_mime_type(filename))
for a gzipped file it returns: application/gzip; charset=binary
for an unzipped txt file (iostat data): text/plain; charset=us-ascii
for a tar file: application/x-tar; charset=binary
for a bz2 file: application/x-bzip2; charset=binary
and last but not least for me a .zip file: application/zip; charset=binary