Our URL is:
http://example.com/kitchen-knife/collection/maitre-universal-cutting-boards-rana-parsley-chopper-cheese-slicer-vegetables-knife-sharpening-stone-ham
This is not possible in the original robots.txt specification.
But some (!) parsers extend the specification and define a wildcard character (typically *).
For those parsers, you could use:
Disallow: /*/collection
Parsers that understand * as a wildcard will stop crawling any URL whose path starts with /, followed by anything (which may be nothing), followed by /collection, followed by anything, e.g.,
http://example.com/foo/collection/
http://example.com/foo/collection/bar
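http://example.com/foo/bar/collection/
(A top-level http://example.com/collection/ would not be matched, because the rule requires one slash before the wildcard and a second one in front of collection.)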
Parsers that don’t understand * as a wildcard (i.e., they follow the original specification) will stop crawling any URL whose path starts with the literal string /*/collection, e.g.,
http://example.com/*/collection/
http://example.com/*/collection/bar
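To make the difference between the two kinds of parsers concrete, here is a minimal Python sketch. It is not taken from any real crawler: the function names wildcard_blocks and literal_blocks are made up for illustration, and the wildcard version only handles * (it ignores other extensions such as the $ end-of-path anchor that some of these parsers also define).

```python
import re

RULE = "/*/collection"  # the Disallow value from the answer above


def wildcard_blocks(rule: str, path: str) -> bool:
    """Hypothetical extended parser: '*' matches any run of characters."""
    # Escape every literal piece of the rule, then join the pieces with '.*'.
    pattern = ".*".join(re.escape(part) for part in rule.split("*"))
    # re.match anchors only at the start, so the rule acts as a path prefix.
    return re.match(pattern, path) is not None


def literal_blocks(rule: str, path: str) -> bool:
    """Hypothetical original-spec parser: '*' is just another character."""
    return path.startswith(rule)


if __name__ == "__main__":
    for path in ("/foo/collection/bar", "/*/collection/bar", "/foo/bar"):
        print(path, wildcard_blocks(RULE, path), literal_blocks(RULE, path))
```

Under these assumptions, a path like /foo/collection/bar is blocked only by the wildcard-aware function, while /*/collection/bar is blocked by both, since a literal * is also something the wildcard can match.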