Blocking folders in between allowed content

Submitted by 自闭症网瘾萝莉.ら on 2019-12-13 01:12:47

Question


I have a site with the following structure:

http://www.example.com/folder1/folder2/folder3

I would like to disallow indexing of folder1 and folder2, but I would like robots to index everything under folder3.

Is there a way to do this with the robots.txt?

From what I have read, I think that everything inside a disallowed folder is disallowed as well.

Would the following achieve my goal?

user-agent: *
Crawl-delay: 0

Sitemap: <Sitemap url>

Allow: /folder1/folder2/folder3
Disallow: /folder1/folder2/
Disallow: /folder1/
Allow: /

Answer 1:


Yes, it works. However, Google has a tool for testing your robots.txt file.

You only need to go to Google Webmaster Tools (https://www.google.com/webmasters/tools/)

and open the section "Site configuration -> Crawler access".




Answer 2:


All you would need is:

user-agent: *
Crawl-delay: 0

Sitemap: 

Allow: /folder1/folder2/folder3
Disallow: /folder1/
Allow: /

Googlebot, at least, will honor the more specific Allow rule for that one directory and disallow everything else under folder1. This is backed up by a post by a Google employee.
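The "more specific rule wins" behaviour described above can be sketched in a few lines of Python. This is a simplified illustration and not Googlebot's actual code: it assumes plain prefix matching with no `*` or `$` wildcards, it assumes that a tie between equally long rules goes to Allow, and the names `RULES` and `is_allowed` are made up for this example.

```python
from typing import List, Tuple

# (directive, path prefix) pairs taken from the robots.txt in this answer.
RULES: List[Tuple[str, str]] = [
    ("allow", "/folder1/folder2/folder3"),
    ("disallow", "/folder1/"),
    ("allow", "/"),
]


def is_allowed(path: str, rules: List[Tuple[str, str]] = RULES) -> bool:
    """Return True if `path` may be crawled under a longest-match policy.

    Assumptions of this sketch: prefix matching only, the longest matching
    prefix wins, ties go to allow, and a path with no matching rule is allowed.
    """
    best_len = -1
    allowed = True
    for directive, prefix in rules:
        if not path.startswith(prefix):
            continue  # this rule does not apply to the path
        is_allow = directive == "allow"
        longer = len(prefix) > best_len
        tie_for_allow = len(prefix) == best_len and is_allow
        if longer or tie_for_allow:
            best_len = len(prefix)
            allowed = is_allow
    return allowed


if __name__ == "__main__":
    for path in (
        "/folder1/folder2/folder3/page.html",
        "/folder1/folder2/other.html",
        "/folder1/index.html",
        "/about.html",
    ):
        print(path, "->", "allowed" if is_allowed(path) else "blocked")
```

Because the Allow path has no trailing slash, a sibling such as /folder1/folder2/folder3-old/ would also match it under prefix matching; append a slash to the rule if that is not intended.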




Answer 3:


Blank lines inside a record are not allowed, so your original robots.txt should look like this:

user-agent: *
Crawl-delay: 0
Sitemap: <Sitemap url>
Allow: /folder1/folder2/folder3
Disallow: /folder1/folder2/
Disallow: /folder1/
Allow: /

Possible improvements:

  • Specifying Allow: / is superfluous, as it’s the default anyway.

  • Specifying Disallow: /folder1/folder2/ is superfluous, as Disallow: /folder1/ is sufficient.

  • As Sitemap is not per record, but for all bots, you could specify it as a separate block.

So your robots.txt could look like this:

User-agent: *
Crawl-delay: 0
Allow: /folder1/folder2/folder3
Disallow: /folder1/

Sitemap: http://example.com/sitemap

(Note that the Allow field is not part of the original robots.txt specification, so don’t expect all bots to understand it.)
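One way to check a file like this locally is Python's standard-library `urllib.robotparser`. The sketch below is only a sanity check against one particular parser, not proof of how any given crawler behaves. The bot name `ExampleBot` and the helper `can_fetch` are hypothetical. As far as I know, this parser understands Allow but applies rules in file order rather than by longest match, so listing the Allow line first matters here. It also appears to end a record at a blank line, which is why the second sample (the question's layout, shortened) is included.

```python
from urllib.robotparser import RobotFileParser

# The cleaned-up file from this answer.
CLEANED = """\
User-agent: *
Crawl-delay: 0
Allow: /folder1/folder2/folder3
Disallow: /folder1/

Sitemap: http://example.com/sitemap
"""

# The question's layout (shortened), with a blank line inside the record.
WITH_BLANK_LINE = """\
user-agent: *
Crawl-delay: 0

Allow: /folder1/folder2/folder3
Disallow: /folder1/
"""


def can_fetch(robots_txt: str, path: str, agent: str = "ExampleBot") -> bool:
    """Parse robots_txt and ask whether `agent` may fetch `path`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, "http://www.example.com" + path)


if __name__ == "__main__":
    paths = (
        "/folder1/folder2/folder3/page.html",
        "/folder1/other.html",
        "/about.html",
    )
    for label, text in (("cleaned", CLEANED), ("blank line", WITH_BLANK_LINE)):
        for path in paths:
            print(label, path, can_fetch(text, path))
```

If the blank-line variant reports /folder1/other.html as fetchable, the parser has dropped the rules that follow the blank line, which is the problem described at the start of this answer.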



Source: https://stackoverflow.com/questions/5998434/blocking-folders-inbetween-allowed-content
