How do you cache a paginated Django queryset, specifically in a ListView?
I noticed one query was taking a long time to run, so I'm attempting to cache it. The queryset
The problem turned out to be a combination of factors. Mainly, the result returned by paginate_queryset() contains a reference to the unlimited queryset, meaning it's essentially uncacheable. When I called cache.set(mykey, (paginator, page, object_list, other_pages)), it was trying to serialize thousands of records instead of just the page_size number of records I was expecting, causing the cached item to exceed memcached's limits and fail.
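To see why that tuple gets so large, here is a minimal sketch that uses plain dictionaries as stand-ins for the paginator and the page. They are not Django's classes, and all_rows is a made-up list. It only illustrates that pickling anything that points back at the full list serializes the full list.

```python
import pickle

# Pretend these are the thousands of records behind the unlimited queryset.
all_rows = list(range(50_000))

# Hypothetical stand-ins: the "paginator" keeps the full list, and the
# "page" keeps one slice plus a back-reference to the paginator.
paginator = {"object_list": all_rows, "per_page": 10}
page = {"object_list": all_rows[:10], "number": 1, "paginator": paginator}

# Pickling the tuple drags the paginator, and so every row, along with it.
size_with_paginator = len(pickle.dumps((paginator, page, page["object_list"])))

# Pickling only the evaluated slice stays proportional to the page size.
size_slice_only = len(pickle.dumps(list(page["object_list"])))

print(size_with_paginator, size_slice_only)
```

The first number grows with the size of the whole list, while the second depends only on the page size.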
The other factor was the horrible default error reporting in memcached/python-memcached, which silently hides all errors and turns cache.set() into a no-op if anything goes wrong, making it very time-consuming to track down the problem.
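One way to surface this kind of failure earlier is to measure the pickled size yourself before writing. The following is only a sketch: checked_cache_set, CacheValueTooLarge and DictCache are hypothetical names, the 1 MB figure is memcached's default item size limit, and the pickled size is a rough estimate of what a backend would actually store.

```python
import pickle

# memcached's default per-item size limit (1 MB); adjust if your server differs.
MEMCACHED_DEFAULT_ITEM_LIMIT = 1024 * 1024


class CacheValueTooLarge(Exception):
    """Hypothetical error raised instead of letting the write fail silently."""


class DictCache:
    """Tiny in-memory stand-in exposing the get()/set() calls used below."""

    def __init__(self):
        self._data = {}

    def get(self, key, default=None):
        return self._data.get(key, default)

    def set(self, key, value, timeout=None):
        self._data[key] = value


def checked_cache_set(cache, key, value, timeout=300,
                      max_bytes=MEMCACHED_DEFAULT_ITEM_LIMIT):
    """Estimate the serialized size and refuse oversized values loudly."""
    size = len(pickle.dumps(value, protocol=pickle.HIGHEST_PROTOCOL))
    if size > max_bytes:
        raise CacheValueTooLarge(
            "value for %r is about %d bytes (limit %d)" % (key, size, max_bytes))
    cache.set(key, value, timeout)
    return size
```

The function takes any object with a set(key, value, timeout) method, so the same check could wrap Django's cache object in place of the stand-in.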
I fixed this by essentially rewriting paginate_queryset() to ditch Django's built-in paginator functionality altogether and calculate the queryset myself with:
object_list = queryset[page_size*(page-1):page_size*(page-1)+page_size]
and then caching that object_list.
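As a rough sketch of that slicing approach, the helper below computes the bounds for a 1-based page number and caches only the evaluated slice. The names page_bounds, get_cached_page, key_prefix and DictCache are hypothetical, and a plain list stands in for the queryset; with a real queryset, list() is what forces the slice to be evaluated before it is stored.

```python
class DictCache:
    """Tiny in-memory stand-in exposing the get()/set() calls used below."""

    def __init__(self):
        self._data = {}

    def get(self, key, default=None):
        return self._data.get(key, default)

    def set(self, key, value, timeout=None):
        self._data[key] = value


def page_bounds(page, page_size):
    """Return the (start, end) slice indices for a 1-based page number."""
    start = page_size * (page - 1)
    return start, start + page_size


def get_cached_page(queryset, page, page_size, cache, key_prefix, timeout=300):
    """Return one page of objects, caching only that page's slice."""
    key = "%s:%s:%s" % (key_prefix, page_size, page)
    object_list = cache.get(key)
    if object_list is None:
        start, end = page_bounds(page, page_size)
        # list() evaluates the slice so only page_size records are stored.
        object_list = list(queryset[start:end])
        cache.set(key, object_list, timeout)
    return object_list
```

Because the key includes both the page size and the page number, changing either one produces a separate cache entry.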
You can extend the Paginator to support caching by a provided cache_key.
A blog post about the usage and implementation of such a CachedPaginator can be found here. The source code is posted at djangosnippets.org (here is a web archive link because the original is not working).
However, I will post a slightly modified version of the original, which caches not only the objects per page but also the total count (sometimes even the count can be an expensive operation).
from django.core.cache import cache
from django.utils.functional import cached_property
from django.core.paginator import Paginator, Page, PageNotAnInteger


class CachedPaginator(Paginator):
    """A paginator that caches the results on a page-by-page basis."""

    def __init__(self, object_list, per_page, orphans=0, allow_empty_first_page=True,
                 cache_key=None, cache_timeout=300):
        super(CachedPaginator, self).__init__(object_list, per_page, orphans, allow_empty_first_page)
        self.cache_key = cache_key
        self.cache_timeout = cache_timeout

    @cached_property
    def count(self):
        """
        The original django.core.paginator.count attribute in Django 1.8
        is not writable and can't be set manually, but we would like
        to override it when loading data from cache (instead of recalculating it).
        So we make it writable via @cached_property.
        """
        return super(CachedPaginator, self).count

    def set_count(self, count):
        """
        Override the paginator.count value (to prevent recalculation)
        and clear num_pages and page_range, whose values depend on it.
        """
        self.count = count
        # If we have somehow stored .num_pages or .page_range (which are cached
        # properties), this can lead to wrong page calculations (because they
        # depend on the paginator.count value), so we clear their values to
        # force recalculation on the next calls.
        try:
            del self.num_pages
        except AttributeError:
            pass
        try:
            del self.page_range
        except AttributeError:
            pass

    @cached_property
    def num_pages(self):
        """This is not writable in Django 1.8. We want to make it writable."""
        return super(CachedPaginator, self).num_pages

    @cached_property
    def page_range(self):
        """This is not writable in Django 1.8. We want to make it writable."""
        return super(CachedPaginator, self).page_range

    def page(self, number):
        """
        Returns a Page object for the given 1-based page number.

        This will attempt to pull the results out of the cache first, based on
        the requested page number. If not found in the cache,
        it will pull a fresh list and then cache that result + the total result count.
        """
        if self.cache_key is None:
            return super(CachedPaginator, self).page(number)

        # In order to prevent counting the queryset,
        # we only validate that the provided number is an integer.
        # The rest of the validation will happen when we fetch fresh data,
        # so if the number is invalid, no cache entry will be set.
        # number = self.validate_number(number)
        try:
            number = int(number)
        except (TypeError, ValueError):
            raise PageNotAnInteger('That page number is not an integer')

        page_cache_key = "%s:%s:%s" % (self.cache_key, self.per_page, number)
        page_data = cache.get(page_cache_key)

        if page_data is None:
            page = super(CachedPaginator, self).page(number)
            # Cache not only the objects, but the total count too.
            page_data = (page.object_list, self.count)
            cache.set(page_cache_key, page_data, self.cache_timeout)
        else:
            cached_object_list, cached_total_count = page_data
            self.set_count(cached_total_count)
            page = Page(cached_object_list, number, self)

        return page
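Since the paginator only appends per_page and the page number to cache_key, the cache_key you pass in has to identify the queryset itself, otherwise differently filtered lists would share cached pages. Here is one possible sketch of building such a key from request parameters. build_cache_key and its arguments are hypothetical and not part of the snippet above; hashing keeps the key short, which matters because memcached limits keys to 250 bytes.

```python
import hashlib
from urllib.parse import urlencode


def build_cache_key(prefix, params):
    """Derive a short, stable key from a prefix and a dict of filter params."""
    # Sort the items so the same filters in a different order give the same key.
    encoded = urlencode(sorted(params.items()))
    digest = hashlib.sha256(encoded.encode("utf-8")).hexdigest()
    return "%s:%s" % (prefix, digest)


# Example: two orderings of the same filters map to one key.
key_a = build_cache_key("posts", {"author": "7", "order": "-created"})
key_b = build_cache_key("posts", {"order": "-created", "author": "7"})
print(key_a == key_b)
```

The resulting string could then be passed as the cache_key argument when constructing the paginator.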
I wanted to paginate my infinite scrolling view on my home page, and this is the solution I came up with. It's a mix of Django CCBVs and the author's initial solution.
The response times didn't improve as much as I had hoped, but that's probably because I'm testing it locally with just 6 posts and 2 users.
# Imports
from django.core.cache import cache
from django.core.paginator import InvalidPage
from django.http import Http404
from django.utils.translation import gettext as _
from django.views.generic.list import ListView


class MyListView(ListView):
    template_name = 'MY TEMPLATE NAME'
    model = MY_POST_MODEL  # placeholder: replace with your post model class
    paginate_by = 10

    def paginate_queryset(self, queryset, page_size):
        """Paginate the queryset, caching the result per page."""
        paginator = self.get_paginator(
            queryset, page_size, orphans=self.get_paginate_orphans(),
            allow_empty_first_page=self.get_allow_empty())
        page_kwarg = self.page_kwarg
        page = self.kwargs.get(page_kwarg) or self.request.GET.get(page_kwarg) or 1
        try:
            page_number = int(page)
        except ValueError:
            if page == 'last':
                page_number = paginator.num_pages
            else:
                raise Http404(_("Page is not 'last', nor can it be converted to an int."))
        try:
            # Validate the page number; raises InvalidPage if it is out of range.
            page = paginator.page(page_number)
            cache_key = 'mylistview-%s-%s' % (page_number, page_size)
            retrieve_cache = cache.get(cache_key)
            if retrieve_cache is None:
                print('re-caching')
                # Note: this tuple includes the paginator, which holds the full queryset.
                retrieve_cache = super(MyListView, self).paginate_queryset(queryset, page_size)
                # Caching for 1 day
                cache.set(cache_key, retrieve_cache, 86400)
            return retrieve_cache
        except InvalidPage as e:
            raise Http404(_('Invalid page (%(page_number)s): %(message)s') % {
                'page_number': page_number,
                'message': str(e)
            })