I have a site with about 150K pages in its sitemap. I'm using the sitemap index generator to make the sitemaps, but really, I need a way of caching it, because building the 150 sitemaps of 1,000 links each is brutal on my server.
I'm using the django-staticgenerator app to cache sitemap.xml to the filesystem, and I regenerate that file whenever the data changes.
settings.py:

    # URLs that staticgenerator saves to disk; the regex also
    # matches the index sections (/sitemap-<section>.xml).
    STATIC_GENERATOR_URLS = (
        r'^/sitemap',
    )
    # Where the generated files are written (assumes `import os` and
    # SITE_ROOT are already defined in this settings module).
    WEB_ROOT = os.path.join(SITE_ROOT, 'cache')
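For context, here is how the sitemap URLs themselves might be wired up so that the regex above matches both the index and its section files. This is a minimal sketch using old-style patterns() and django.contrib.sitemaps; the import path for Page and the 'pages' section name are assumptions:

urls.py:

    from django.conf.urls.defaults import patterns
    from django.contrib.sitemaps import GenericSitemap
    from myapp.models import Page  # assumption: wherever Page lives

    # django.contrib.sitemaps paginates automatically (50,000 URLs
    # per file by default), so 150K pages become several files.
    sitemaps = {
        'pages': GenericSitemap({'queryset': Page.objects.all()}),
    }

    urlpatterns = patterns('django.contrib.sitemaps.views',
        (r'^sitemap\.xml$', 'index', {'sitemaps': sitemaps}),
        (r'^sitemap-(?P<section>.+)\.xml$', 'sitemap', {'sitemaps': sitemaps}),
    )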
models.py:

    from django.contrib.sitemaps import ping_google
    from django.db.models.signals import post_save, post_delete
    from django.dispatch import receiver
    from staticgenerator import quick_publish, quick_delete

    @receiver(post_delete)
    @receiver(post_save)
    def delete_cache(sender, **kwargs):
        # Check if a Page instance changed (Page is the model
        # defined in this module)
        if sender == Page:
            quick_delete('/sitemap.xml')
            # You may republish the sitemap file now:
            # quick_publish('/', '/sitemap.xml')
            ping_google()
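A small refinement: the receivers can be bound to the Page model directly, so the handler doesn't fire for every save and delete in the project. A sketch of the same handler with the sender pinned:

    @receiver(post_delete, sender=Page)
    @receiver(post_save, sender=Page)
    def delete_cache(sender, **kwargs):
        # Only Page signals arrive here, so no sender check is needed.
        quick_delete('/sitemap.xml')
        ping_google()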
In the nginx configuration I point sitemap.xml at the cache folder, with the Django instance as a fallback:
    location /sitemap.xml {
        root /var/www/django_project/cache;

        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_set_header Host $http_host;

        # staticgenerator saves directory-style URLs as <path>/index.html
        if (-f $request_filename/index.html) {
            rewrite (.*) $1/index.html break;
        }
        # If the file doesn't exist, pass the request to Django
        if (!-f $request_filename) {
            proxy_pass http://127.0.0.1:8000;
            break;
        }
    }
With this method sitemap.xml is always up to date, and clients (like Google) always get the XML file served statically. That's cool, I think! :)
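If you'd rather not let the first request after an update pay the regeneration cost, you could also rebuild the file eagerly with quick_publish, for example from a small management command run after deploys or from cron. A minimal sketch; the app path and command name are made up:

yourapp/management/commands/publish_sitemap.py:

    from django.core.management.base import BaseCommand
    from staticgenerator import quick_publish

    class Command(BaseCommand):
        help = "Regenerate the cached sitemap.xml under WEB_ROOT"

        def handle(self, *args, **options):
            # quick_publish renders the URL through Django and writes
            # the response into WEB_ROOT, replacing the stale file.
            quick_publish('/sitemap.xml')

Run it with python manage.py publish_sitemap.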