How to disallow service api and multilingual urls in robots.txt

随声附和 submitted on 2019-12-13 16:42:25

Question


I need to disallow the following URLs:

  1. Service API: /_s/user, /_s/place, ... Everything that starts with /_s/
  2. The save form: /{language}/save. For example /en/save, /ru/save, ...

NOTE: most URLs have a language parameter at the beginning (/en/event, ...). I don't want to block those.

It should be something like this (but this is not allowed by the robots.txt format):

Disallow: /_s/*
Disallow: /:lang/save

Answer 1:


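In robots.txt, matching starts from the left: a rule matches any URL path that begins with the given pattern.

A wildcard, as in /*pattern, matches any beginning as long as it is followed by the given pattern. A trailing * is therefore never needed: /foo* is equivalent to /foo. Note that * and $ are extensions to the original robots.txt format; major crawlers such as Googlebot and Bingbot support them, but not every crawler does.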

So in your case you can use:

Disallow: /_s/

  • to disallow anything that starts with /_s/, e.g. /_s/foo

Disallow: /*save

  • to disallow all paths such as /en/save, but also /foosave or /en/save/other

You can use $ to signify "must end with":

Disallow: /*save$

  • to disallow all paths such as /en/save or /fr/save, but not /en/save/other
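
To check which paths a given rule would catch, the matching behaviour described above can be approximated in a few lines of Python. This is only a sketch, not how any particular crawler is implemented: `compile_rule` and `is_disallowed` are hypothetical helper names, the list of test paths is made up, and the sketch ignores Allow rules, User-agent groups and percent-encoding. The last pattern, /*/save$, is not part of the answer above; it is included as an assumption about how one might avoid matching /foosave.

```python
import re


def compile_rule(pattern):
    """Turn a robots.txt path pattern into a compiled regular expression."""
    anchored = pattern.endswith("$")  # a trailing '$' means "must end here"
    if anchored:
        pattern = pattern[:-1]
    # Escape the literal pieces and join them with '.*', one per '*'.
    body = ".*".join(re.escape(piece) for piece in pattern.split("*"))
    return re.compile(body + (r"\Z" if anchored else ""))


def is_disallowed(path, patterns):
    """Return True if any pattern matches the path starting at its first character."""
    # re.match only matches at the start of the string, which gives the
    # "matching is from the left" behaviour.
    return any(compile_rule(p).match(path) for p in patterns)


# Hypothetical sample paths, used only to compare the rules.
paths = ["/_s/user", "/en/save", "/en/save/other", "/foosave", "/en/event"]

for rule in ["/_s/", "/*save", "/*save$", "/*/save$"]:
    blocked = [p for p in paths if is_disallowed(p, [rule])]
    print(rule, "->", blocked)
```

Under these assumptions, /*save$ still catches /foosave, while /*/save$ catches only /en/save among the sample paths. Keep in mind that /*/save$ would also match deeper paths such as /a/b/save, since * can span slashes.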

You can find a bit more on robots.txt in the article "Robots.txt: 4 Things You Should Know".

I hope that helps.



Source: https://stackoverflow.com/questions/14608160/how-to-disallow-service-api-and-multilingual-urls-in-robots-txt
