I'm trying to get my images thumbnailed and stored on S3 using django-storages, boto, and sorl-thumbnail. I have it working, but it's very slow, even with small images. I don\
As the author of sorl-thumbnail I am really interested in solving this if it is not working as I intended. If the key value store is populated, it currently stores: name, storage and size. I have made the assumption that the url is based on the name and thus should not cause any storage calls. Looking at django-storages, https://github.com/e-loue/django-storages/blob/master/storages/backends/s3boto.py#L214 it seems like a safe assumption to make. In your patch you have patched the read method for some reason. When creating a thumbnail, an ImageFile instance is fetched from the cache (or created if it is not there). You can of course call read, which will read the file, but the intended use is .url, which calls url on the storage with the cached name, which in turn should not be a storage access operation. Could you try to isolate exactly where in your code this storage access happens?
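One way to narrow this down is to count which operations actually reach the backend. The following is a minimal sketch, not sorl-thumbnail code: `CountingStorage` and `CachedImage` are hypothetical stand-ins and the base URL is made up. It only illustrates the intended behaviour, where `.url` is built from the cached name and `.read()` is the call that touches the storage.

```python
class CountingStorage:
    """Hypothetical stand-in for a remote backend such as S3."""

    def __init__(self, base_url):
        self.base_url = base_url
        self.remote_calls = []  # operations that would hit the network

    def url(self, name):
        # Assumption: the URL is derived from the name alone, no remote access.
        return self.base_url + name

    def open(self, name):
        # Reading the file content is a real storage access.
        self.remote_calls.append(("open", name))
        return b"fake image bytes"


class CachedImage:
    """Hypothetical image object rebuilt from cached name, storage and size."""

    def __init__(self, name, storage, size):
        self.name = name
        self.storage = storage
        self.size = size

    @property
    def url(self):
        return self.storage.url(self.name)

    def read(self):
        return self.storage.open(self.name)


storage = CountingStorage("https://example.invalid/media/")
thumb = CachedImage("cache/ab/cd/thumb.jpg", storage, (100, 100))

print(thumb.url)             # built from the cached name only
print(storage.remote_calls)  # still empty: no storage access so far
thumb.read()
print(storage.remote_calls)  # now records the single open() call
```

Wrapping the real storage class in the same way (recording each call to its remote methods) should show which template tag or property triggers the unexpected access.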
Also make sure you have THUMBNAIL_DEBUG on and that you have the key value store properly set up.
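For reference, a settings fragment along these lines is what I would expect, assuming the cached database key value store that ships with sorl-thumbnail is being used; adjust the dotted path if a different store is configured.

```python
# settings.py (fragment)

# Let thumbnail errors surface instead of being swallowed silently.
THUMBNAIL_DEBUG = True

# Key value store that caches thumbnail metadata (name, storage, size).
# Assumption: the cached database store is in use; it relies on Django's
# cache framework and a database table, so both need to be working.
THUMBNAIL_KVSTORE = "sorl.thumbnail.kvstores.cached_db_kvstore.KVStore"
```

If the cache behind the store is not actually working, every lookup can fall through to slower paths, which would look a lot like the symptom described here.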
After looking at the @shadfc django ticket, I reimplemented the monkeypatch as follows:
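from django.core.files.images import ImageFile, get_image_dimensions


def _get_image_dimensions(self):
    if not hasattr(self, '_dimensions_cache'):
        if getattr(self.storage, 'IGNORE_IMAGE_DIMENSIONS', False):
            # Skip the storage entirely and report placeholder dimensions.
            self._dimensions_cache = (0, 0)
        else:
            # Default behaviour: open the file and read its real dimensions.
            close = self.closed
            self.open()
            self._dimensions_cache = get_image_dimensions(self, close=close)
    return self._dimensions_cache


# Replace the stock implementation with the storage-aware one.
ImageFile._get_image_dimensions = _get_image_dimensions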
To use it, just set IGNORE_IMAGE_DIMENSIONS = True on your storage class, and the storage will not be touched to get image dimensions. For example:
from storages.backends.s3boto import S3BotoStorage
S3BotoStorage.IGNORE_IMAGE_DIMENSIONS = True
I still need to investigate where the numbers are used, to know whether simply returning (0, 0) can lead to any problems, but no bug has come up so far.
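As an illustration of where a (0, 0) placeholder could cause trouble, here is a hypothetical sketch. `aspect_ratio` and `safe_aspect_ratio` are made-up helpers, not part of Django or sorl-thumbnail. They stand in for any code that does arithmetic on the reported dimensions.

```python
def aspect_ratio(dimensions):
    """Hypothetical consumer: width divided by height."""
    width, height = dimensions
    return width / height


def safe_aspect_ratio(dimensions):
    """Same calculation, but treats a zero or missing height as unknown."""
    width, height = dimensions
    if not height:
        return None
    return width / height


print(aspect_ratio((800, 600)))   # fine for real dimensions
print(safe_aspect_ratio((0, 0)))  # None: the dimensions were never read

try:
    aspect_ratio((0, 0))
except ZeroDivisionError as exc:
    print("placeholder dimensions broke the calculation:", exc)
```

Any code path that divides by the height, or that compares the dimensions against a minimum size, is the kind of place worth checking.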
I'm not sure if your problem is the same as mine, but I found that accessing the width or height property of a normal Django ImageField reads the file from the storage backend, loads it into PIL, and returns the dimensions from there. This is especially costly with a remote backend like the one we're using, and we have very media-heavy pages.
https://code.djangoproject.com/ticket/8307 was opened to address this, but the Django devs closed it as wontfix because they want the width and height properties to always return the true values. So I just monkeypatch _get_image_dimensions() to use the values stored in the model's width_field and height_field when they are set, which prevents a large number of the boto messages and improves my page-load times.
Below is my code modified from the patch attached to that ticket. I stuck this in a place which gets executed early, such as a models.py.
from numbers import Number

from django.core.files.images import ImageFile, get_image_dimensions


def _get_image_dimensions(self):
    if not hasattr(self, '_dimensions_cache'):
        close = self.closed
        if self.field.width_field and self.field.height_field:
            width = getattr(self.instance, self.field.width_field)
            height = getattr(self.instance, self.field.height_field)
            # Check that the fields hold proper values
            if isinstance(width, Number) and isinstance(height, Number):
                self._dimensions_cache = (width, height)
            else:
                # Fields exist but are empty: read the file as usual.
                self.open()
                self._dimensions_cache = get_image_dimensions(self, close=close)
        else:
            # No width/height fields configured: read the file as usual.
            self.open()
            self._dimensions_cache = get_image_dimensions(self, close=close)
    return self._dimensions_cache


ImageFile._get_image_dimensions = _get_image_dimensions
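The decision the patch makes can be tried out without Django. This is only a sketch of the same logic with made-up names: `resolve_dimensions` and `slow_read` are hypothetical, and `slow_read` stands in for opening the file on the remote storage.

```python
from numbers import Number


def resolve_dimensions(stored_width, stored_height, read_dimensions):
    """Return (width, height), calling read_dimensions() only when needed."""
    if isinstance(stored_width, Number) and isinstance(stored_height, Number):
        return (stored_width, stored_height)
    return read_dimensions()


calls = []


def slow_read():
    # Stand-in for fetching the file from the backend and inspecting it.
    calls.append("read")
    return (640, 480)


print(resolve_dimensions(800, 600, slow_read))    # uses the stored values
print(resolve_dimensions(None, None, slow_read))  # falls back to reading
print(calls)                                      # one fallback read recorded
```

The practical consequence is that rows whose width and height columns are still empty keep hitting the storage, so those columns need to be populated for the patch to help.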