Question
I used
Disallow: /*?
in the robots.txt file to disallow all pages that might contain a "?" in the URL.
Is that syntax correct, or am I blocking other pages as well?
Answer 1:
It depends on the bot.
Bots that follow the original robots.txt specification don't give the * any special meaning. These bots would only block URLs whose path literally starts with /*, directly followed by ?, e.g., http://example.com/*?foo.
Some bots, including Googlebot, give the * character a special meaning: it typically stands for any sequence of characters. These bots would block what you seem to intend: any URL containing a ?.
Google's robots.txt documentation includes this very case:

To block access to all URLs that include question marks (?). For example, the sample code blocks URLs that begin with your domain name, followed by any string, followed by a question mark, and ending with any string:

User-agent: Googlebot
Disallow: /*?
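The difference between the two interpretations can be sketched with a small matcher. This is a simplification for illustration only: real crawlers also handle Allow rules, percent-encoding, and rule-precedence, which are ignored here.

```python
import re

def blocked_original(path: str, rule: str) -> bool:
    # Original robots.txt spec: the rule is a literal path prefix;
    # "*" has no special meaning.
    return path.startswith(rule)

def blocked_google(path: str, rule: str) -> bool:
    # Googlebot-style matching: "*" matches any sequence of characters,
    # "$" anchors the end of the URL.
    pattern = re.escape(rule).replace(r"\*", ".*").replace(r"\$", "$")
    return re.match(pattern, path) is not None

rule = "/*?"
print(blocked_original("/page?x=1", rule))  # False: path has no literal "/*?" prefix
print(blocked_original("/*?foo", rule))     # True: literal prefix matches
print(blocked_google("/page?x=1", rule))    # True: "*" matches "page"
print(blocked_google("/plain", rule))       # False: no "?" in the URL
```

So with Disallow: /*?, a spec-only bot blocks almost nothing, while a wildcard-aware bot like Googlebot blocks every URL containing a question mark.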
Source: https://stackoverflow.com/questions/41140542/using-disallow-in-robots-txt-file