Robots.txt Allow sub folder but not the parent

南旧 2021-01-03 20:54

Can anybody please explain the correct robots.txt command for the following scenario.

I would like to allow access to:

/directory/subdirec

3 Answers
  • 2021-01-03 21:29

    I would recommend Google's robots.txt Tester, which is part of Google Webmaster Tools: https://support.google.com/webmasters/answer/6062598?hl=en

    You can edit the rules and test URLs against them right in the tool, and Webmaster Tools gives you a wealth of other utilities as well.

  • 2021-01-03 21:44

    Be aware that there is no real official standard and that any web crawler may happily ignore your robots.txt.

    According to a Google Groups post, the following works, at least with Googlebot:

    User-agent: Googlebot 
    Disallow: /directory/ 
    Allow: /directory/subdirectory/
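
To see why these two rules can coexist, here is a minimal sketch of the "most specific rule wins" precedence that Google documents for Googlebot: the longest matching path prefix decides, and on a tie the Allow rule wins. The `is_allowed` helper and the `RULES` list are hypothetical names made up for illustration, and the sketch deliberately ignores wildcards (`*`, `$`) and user-agent grouping. Other crawlers may resolve conflicts differently, for example by taking the first matching line.

```python
def is_allowed(rules, path):
    """Return True if `path` may be crawled under `rules`.

    `rules` is a list of (directive, prefix) pairs,
    e.g. ("Disallow", "/directory/"). Hypothetical helper, not a real API.
    """
    best_len = -1
    allowed = True  # a path with no matching rule is allowed
    for directive, prefix in rules:
        # An empty prefix matches nothing; skip rules that do not apply.
        if not prefix or not path.startswith(prefix):
            continue
        is_allow = directive.lower() == "allow"
        # Longest prefix wins; on equal length, Allow beats Disallow.
        if len(prefix) > best_len or (len(prefix) == best_len and is_allow):
            best_len = len(prefix)
            allowed = is_allow
    return allowed


# The same two rules as in the robots.txt above.
RULES = [
    ("Disallow", "/directory/"),
    ("Allow", "/directory/subdirectory/"),
]

print(is_allowed(RULES, "/directory/page.html"))               # False
print(is_allowed(RULES, "/directory/subdirectory/page.html"))  # True
print(is_allowed(RULES, "/other/page.html"))                   # True
```

Because precedence here depends on prefix length rather than line order, swapping the two lines would give the same result under this model. A crawler that stops at the first matching line would not behave that way.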
    
  • If these are truly directories, then the accepted answer is probably your best choice. But if you're writing an application and the directories are dynamically generated paths (a.k.a. contexts, routes, etc.), you might want to use meta tags instead of defining the rules in robots.txt. That way you don't have to worry about how different crawlers interpret or prioritize the rules for the subdirectory path.

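    You might try something like this in the page template (ERB-style syntax is assumed here, and is_parent_directory_path stands for a flag your application would set):

    <% if is_parent_directory_path %>
      <meta name="robots" content="noindex, nofollow">
    <% end %>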
    