Question
I need to disallow the following URLs:
- service API: /_s/user, /_s/place, ... (everything that starts with /_s/)
- save form: /{language}/save, for example /{en}/save, /{ru}/save, ...
NOTE: most URLs have a language parameter at the beginning (/en/event, ...), and I don't want to block those.
It should be something like this (but this is not allowed by the robots.txt format):
Disallow: /_s/*
Disallow: /:lang/save
Answer 1:
In robots.txt, matching is from the left, so a rule matches any path that begins with /pattern.
A wildcard such as /*pattern matches any beginning, which must then be followed by the given pattern. Therefore a trailing * is never needed: /foo* is equivalent to /foo.
So in your case you can use:
Disallow: /_s/
- to disallow anything that starts with /_s/, e.g. /_s/foo
Disallow: /*save
- to disallow all paths such as /en/save, but also /foosave or /en/save/other
You can use $ to signify "must end with":
Disallow: /*save$
- to disallow all paths such as /en/save or /fr/save, but not /en/save/other
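The matching rules above can be checked with a small script. This is only a minimal sketch in Python: the helper names robots_pattern_to_regex and is_disallowed are hypothetical, and it models just the three behaviours described here (left-anchored matching, * as "any run of characters", and a trailing $ as "must end here"). It is not how any particular crawler implements matching, and it ignores Allow rules, rule precedence, and percent-encoding.

```python
import re


def robots_pattern_to_regex(pattern: str) -> re.Pattern:
    """Translate a robots.txt path pattern into a regex (hypothetical helper).

    '*' matches any run of characters, a trailing '$' anchors the end of
    the path, and everything else is treated as a literal.
    """
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    # Escape the literal pieces and join them with '.*' where '*' appeared.
    body = ".*".join(re.escape(piece) for piece in pattern.split("*"))
    return re.compile(body + (r"\Z" if anchored else ""))


def is_disallowed(path: str, pattern: str) -> bool:
    """Return True if the Disallow pattern matches the given URL path."""
    # re.match only matches at the start of the string, which gives the
    # "matching is from the left" behaviour.
    return robots_pattern_to_regex(pattern).match(path) is not None


if __name__ == "__main__":
    checks = [
        ("/_s/", "/_s/foo"),
        ("/*save", "/en/save"),
        ("/*save", "/foosave"),
        ("/*save", "/en/save/other"),
        ("/*save$", "/en/save"),
        ("/*save$", "/en/save/other"),
        ("/*save$", "/en/event"),
    ]
    for pattern, path in checks:
        print(f"Disallow: {pattern:<8} {path:<16} blocked={is_disallowed(path, pattern)}")
```

Running it against the paths from the question is a quick way to confirm that /en/event stays crawlable while the save URLs are blocked before deploying the rules.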
You can find a bit more on robots.txt in the article "Robots.txt : 4 Things You Should Know".
I hope that will help.
Source: https://stackoverflow.com/questions/14608160/how-to-disallow-service-api-and-multilingual-urls-in-robots-txt