How to efficiently serve massive sitemaps in Django

傲寒 2021-02-01 22:37

I have a site with about 150K pages in its sitemap. I'm using the sitemap index generator to make the sitemaps, but really, I need a way of caching it, because building the 150

4 answers
  •  孤独总比滥情好
    2021-02-01 23:04

    I'm using the django-staticgenerator app to cache sitemap.xml on the filesystem, and I update that file whenever the data changes.

    settings.py:

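    import os

    # URL patterns (regexes) whose responses are cached as static files
    STATIC_GENERATOR_URLS = (
        r'^/sitemap',
    )
    # Cache directory; SITE_ROOT is assumed to be defined earlier in settings.py
    WEB_ROOT = os.path.join(SITE_ROOT, 'cache')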
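To see which URLs a pattern like `r'^/sitemap'` would cover, here is a small sketch. It assumes the patterns are applied to the request path with `re.match`, which is how I understand staticgenerator's middleware to behave; `is_cacheable` is a hypothetical helper for illustration, not part of the library.

```python
import re

# Same pattern tuple as in the settings above.
STATIC_GENERATOR_URLS = (
    r'^/sitemap',
)


def is_cacheable(path, patterns=STATIC_GENERATOR_URLS):
    """Return True if any pattern matches at the start of the request path."""
    return any(re.match(pattern, path) for pattern in patterns)


print(is_cacheable('/sitemap.xml'))        # True
print(is_cacheable('/sitemap-pages.xml'))  # True
print(is_cacheable('/blog/sitemap.xml'))   # False
```

Because the pattern is only anchored at the start, the per-section files produced by a sitemap index (for example `/sitemap-pages.xml`) would be matched as well as `/sitemap.xml` itself.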
    

    models.py:

    from staticgenerator import quick_publish, quick_delete
    from django.dispatch import receiver
    from django.db.models.signals import post_save, post_delete
    from django.contrib.sitemaps import ping_google
    
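    # Runs after any model is saved or deleted; Page is assumed to be
    # defined earlier in this models.py
    @receiver(post_delete)
    @receiver(post_save)
    def delete_cache(sender, **kwargs):
        # Only react when a Page instance changed
        if sender == Page:
            # Drop the cached file so the next request regenerates it
            quick_delete('/sitemap.xml')
            # You may republish the sitemap file right away instead:
            # quick_publish('/', '/sitemap.xml')
            try:
                ping_google()
            except Exception:
                # A failed ping (e.g. a network error) must not break the save
                pass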
    

    In the nginx configuration, I serve sitemap.xml from the cache folder and fall back to the Django instance when no cached file exists:

    location /sitemap.xml {
        root /var/www/django_project/cache;
    
        proxy_set_header  X-Real-IP  $remote_addr;
        proxy_set_header  X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_set_header Host $http_host;
    
        if (-f $request_filename/index.html) {
            rewrite (.*) $1/index.html break;
        }
        # If the file doesn't exist, pass the request to Django
        if (!-f $request_filename) {
            proxy_pass http://127.0.0.1:8000;
            break;
        }    
    }
    

    With this method, sitemap.xml is always up to date, and clients (like Google) always get the XML file served statically. That's cool, I think! :)
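The overall pattern (serve the cached file if it exists, render it once if it doesn't, delete it when the data changes) can be sketched in plain Python. This is only an illustration of the idea, not staticgenerator's actual implementation; `publish`, `delete`, `serve`, and `render_sitemap` are hypothetical names, and a temporary directory stands in for `WEB_ROOT`.

```python
import os
import tempfile


def cache_path(web_root, url_path):
    """Map a URL path such as '/sitemap.xml' to a file under web_root."""
    return os.path.join(web_root, url_path.lstrip('/'))


def publish(web_root, url_path, render):
    """Render the content once and write it to the cache directory."""
    target = cache_path(web_root, url_path)
    os.makedirs(os.path.dirname(target), exist_ok=True)
    tmp = target + '.tmp'
    with open(tmp, 'w', encoding='utf-8') as handle:
        handle.write(render())
    # Rename into place so readers never see a half-written file
    os.replace(tmp, target)
    return target


def delete(web_root, url_path):
    """Remove the cached file; a missing file is not an error."""
    try:
        os.remove(cache_path(web_root, url_path))
    except FileNotFoundError:
        pass


def serve(web_root, url_path, render):
    """Return the cached content, rendering and caching it on a miss."""
    target = cache_path(web_root, url_path)
    if not os.path.isfile(target):
        publish(web_root, url_path, render)
    with open(target, encoding='utf-8') as handle:
        return handle.read()


# Demo: the expensive render runs only on a cache miss.
render_calls = []


def render_sitemap():
    render_calls.append(1)
    return '<urlset></urlset>'


demo_root = tempfile.mkdtemp()
serve(demo_root, '/sitemap.xml', render_sitemap)
serve(demo_root, '/sitemap.xml', render_sitemap)
print(len(render_calls))  # 1
delete(demo_root, '/sitemap.xml')
serve(demo_root, '/sitemap.xml', render_sitemap)
print(len(render_calls))  # 2
```

In the setup above, nginx plays the role of `serve` for cache hits, Django handles the miss, and the signal handler plays the role of `delete`.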
