I have written an email form class in Django with a FileField. I want to check the uploaded file for its type via checking its mimetype. Subsequently, I want to limit file t
Why no trying something like that in your view :
m = magic.Magic()
m.from_buffer(request.FILES['my_file_field'].read())
Or use request.FILES
in place of form.cleaned_data
if django.forms.Form
is really not an option.
You can use django-safe-filefield package to validate that uploaded file extension match it MIME-type.
from safe_filefield.forms import SafeFileField
class MyForm(forms.Form):
attachment = SafeFileField(
allowed_extensions=('xls', 'xlsx', 'csv')
)
In case you're handling a file upload and concerned only about images,
Django will set content_type
for you (or rather for itself?):
from django.forms import ModelForm
from django.core.files import File
from django.db import models
class MyPhoto(models.Model):
photo = models.ImageField(upload_to=photo_upload_to, max_length=1000)
class MyForm(ModelForm):
class Meta:
model = MyPhoto
fields = ['photo']
photo = MyPhoto.objects.first()
photo = File(open('1.jpeg', 'rb'))
form = MyForm(files={'photo': photo})
if form.is_valid():
print(form.instance.photo.file.content_type)
It doesn't rely on content type provided by the user. But
django.db.models.fields.files.FieldFile.file
is an undocumented
property.
Actually, initially content_type
is set from the request, but when
the form gets validated, the value is updated.
Regarding non-images, doing request.FILES['name'].read()
seems okay to me.
First, that's what Django does. Second, files larger than 2.5 Mb by default
are stored on a disk. So let me point you at the other answer
here.
For the curious, here's the stack trace that leads to updating
content_type
:
django.forms.forms.BaseForm.is_valid: self.errors
django.forms.forms.BaseForm.errors: self.full_clean()
django.forms.forms.BaseForm.full_clean: self._clean_fields()
django.forms.forms.BaseForm._clean_fiels: field.clean()
django.forms.fields.FileField.clean: super().clean()
django.forms.fields.Field.clean: self.to_python()
django.forms.fields.ImageField.to_python
Stan described good variant with buffer. Unfortunately the weakness of this method is reading file to the memory. Another option is using temporary stored file:
import tempfile
import magic
with tempfile.NamedTemporaryFile() as tmp:
for chunk in form.cleaned_data['file'].chunks():
tmp.write(chunk)
print(magic.from_file(tmp.name, mime=True))
Also, you might want to check the file size:
if form.cleaned_data['file'].size < ...:
print(magic.from_buffer(form.cleaned_data['file'].read()))
else:
# store to disk (the code above)
Additionally:
Whether the name can be used to open the file a second time, while the named temporary file is still open, varies across platforms (it can be so used on Unix; it cannot on Windows NT or later).
So you might want to handle it like so:
import os
tmp = tempfile.NamedTemporaryFile(delete=False)
try:
for chunk in form.cleaned_data['file'].chunks():
tmp.write(chunk)
print(magic.from_file(tmp.name, mime=True))
finally:
os.unlink(tmp.name)
tmp.close()
Also, you might want to seek(0) after read()
:
if hasattr(f, 'seek') and callable(f.seek):
f.seek(0)
Where uploaded data is stored
mime = magic.Magic(mime=True)
attachment = form.cleaned_data['attachment']
if hasattr(attachment, 'temporary_file_path'):
# file is temporary on the disk, so we can get full path of it.
mime_type = mime.from_file(attachment.temporary_file_path())
else:
# file is on the memory
mime_type = mime.from_buffer(attachment.read())
Also, you might want to seek(0) after read()
:
if hasattr(f, 'seek') and callable(f.seek):
f.seek(0)
Example from Django code. Performed for image fields during validation.