What are recommended directives for robots.txt in a Django application?


Question


Currently my django project has following structure.

./
../ 
app1/
app2/
django_project
manage.py
media
static
secret_stuff

and my robots.txt looks something like this:

User-agent: *
Allow: /
Sitemap: mysite.com/sitemaps.xml

I want to know following things:

  1. What are the recommended directives I should add to my robots.txt file, since the Django documentation says nothing on this topic?

  2. How do I stop bots from reaching (indexing) the contents of the secret_stuff and mysite.com/admin/ directories?

      Disallow: /secret_stuff      (Is that okay?)
      Disallow: /admin            (Is that okay?)
    

Answer 1:


Robots directives are not part of the Django framework, which is why you won't find any information about them in the Django docs. Normally, it is up to you to decide what to allow and what to disallow for crawling on your website.

There are different ways to include robots.txt in a Django project. I personally use the django-robots app, which simplifies embedding robots.txt into your project.
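A minimal sketch of that wiring, assuming the django-robots package and the sites framework are installed (the rules themselves are then managed through the Django admin):

# settings.py -- a minimal sketch, assuming django-robots is installed
INSTALLED_APPS = [
    # ...
    'django.contrib.sites',  # required by django-robots
    'robots',
]
SITE_ID = 1

# urls.py
from django.urls import include, path

urlpatterns = [
    # serves /robots.txt built from the rules defined in the admin
    path('robots.txt', include('robots.urls')),
]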

It is not necessary to use it in every project. If you find it simpler, you can just render the txt file yourself.
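For example, a sketch of serving a hand-written robots.txt without any extra packages, assuming the file lives in your templates directory:

# urls.py -- serve a static robots.txt template as plain text
from django.urls import path
from django.views.generic import TemplateView

urlpatterns = [
    # renders templates/robots.txt with the text/plain content type
    path(
        'robots.txt',
        TemplateView.as_view(template_name='robots.txt', content_type='text/plain'),
    ),
]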

My simplified robots.txt for a Django project looks like this:

User-agent: *
Disallow: /*.pdf
Disallow: /*.ppt
Disallow: /*.doc
Disallow: /*.xls
Disallow: /*.txt

User-agent: Yandex
Allow: /events
Allow: /contests
Allow: /schools
Disallow: /admin
Crawl-delay: 3

User-agent: Googlebot
Allow: /events
Allow: /contests
Allow: /schools
Disallow: /admin
Crawl-delay: 3

Host: https://mysite.ru
Sitemap: https://mysite.ru/sitemap.xml


Source: https://stackoverflow.com/questions/42594231/what-are-recommended-directives-for-robots-txt-in-a-django-application
