Question
I have a site with the following robots.txt in the root:
User-agent: *
Disabled: /
User-agent: Googlebot
Disabled: /
User-agent: Googlebot-Image
Disallow: /
Yet pages within this site are being crawled by Googlebot all day long. Is there something wrong with my file or with Google?
Answer 1:
It should be Disallow:, not Disabled:.
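A typo like this is easy to catch mechanically. Here is a rough sketch that flags field names a crawler is unlikely to recognize. The function name `unknown_directives` and the `KNOWN_FIELDS` set are my own assumptions for illustration (crawl-delay in particular is non-standard), so adjust the set to whatever your target crawlers document.

```python
# Assumed set of field names; not an official list from any crawler.
KNOWN_FIELDS = {"user-agent", "disallow", "allow", "sitemap", "crawl-delay"}

# The robots.txt from the question, reproduced unchanged.
ROBOTS_TXT = """\
User-agent: *
Disabled: /
User-agent: Googlebot
Disabled: /
User-agent: Googlebot-Image
Disallow: /
"""


def unknown_directives(robots_txt):
    """Return (line_number, field) pairs for fields not in KNOWN_FIELDS."""
    problems = []
    for number, raw in enumerate(robots_txt.splitlines(), start=1):
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line or ":" not in line:
            continue
        field = line.split(":", 1)[0].strip().lower()
        if field not in KNOWN_FIELDS:
            problems.append((number, field))
    return problems


print(unknown_directives(ROBOTS_TXT))  # [(2, 'disabled'), (4, 'disabled')]
```

Run against the file in the question, it points at the two Disabled: lines.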
Answer 2:
Maybe give the Google robots.txt checker a try.
Answer 3:
Google has an analysis tool for checking robots.txt entries; read about it here.
You might also want to check the IP addresses of the "rogue" robots to see if they really are owned by Google.
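One common way to do the IP check is a reverse DNS lookup followed by a forward lookup that must resolve back to the same address. The sketch below is only an illustration: the function name `is_google_crawler` and the suffix list are my assumptions, so confirm the current hostnames against Google's own documentation. Note that `socket.gethostbyname_ex` only handles IPv4.

```python
import socket

# Assumed hostname suffixes for Google crawlers; verify before relying on them.
GOOGLE_SUFFIXES = (".googlebot.com", ".google.com")


def is_google_crawler(ip,
                      reverse_lookup=socket.gethostbyaddr,
                      forward_lookup=socket.gethostbyname_ex):
    """Return True if ip reverse-resolves to a Google host that maps back to ip."""
    try:
        hostname = reverse_lookup(ip)[0]
    except OSError:  # socket.herror and socket.gaierror are OSError subclasses
        return False
    if not hostname.lower().rstrip(".").endswith(GOOGLE_SUFFIXES):
        return False
    try:
        # The forward lookup guards against spoofed reverse DNS records.
        addresses = forward_lookup(hostname)[2]
    except OSError:
        return False
    return ip in addresses
```

The lookup functions are parameters so the check can be exercised without network access. In normal use you would just call `is_google_crawler(ip)` with an address taken from your access log.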
Answer 4:
Also, I believe that the bot goes down the page and takes the first directive that applies to it. In your case, Googlebot and Googlebot-Image would never see their specific directives because they would respect the "User-Agent: *" first.
Disregard this answer. I found information that points to this not being the case. The bot should find the directive specific to it and respect that.
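To illustrate the "specific group wins" behaviour, here is a small sketch using Python's standard `urllib.robotparser`. This is Python's parser, not Google's, so treat it as a demonstration of the matching rule rather than proof of how Googlebot behaves. The robots.txt content and the bot name SomeOtherBot are made up for the example.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical file: the wildcard group comes first, the specific group second.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/

User-agent: Googlebot
Disallow: /
"""


def build_parser(text):
    """Parse robots.txt text directly, without fetching anything."""
    parser = RobotFileParser()
    parser.parse(text.splitlines())
    return parser


parser = build_parser(ROBOTS_TXT)

# Googlebot is matched by its own group, even though "*" appears earlier.
print(parser.can_fetch("Googlebot", "/index.html"))     # False
# Any other bot falls back to the "*" group.
print(parser.can_fetch("SomeOtherBot", "/index.html"))  # True
```

If the first matching group won, Googlebot would be allowed to fetch /index.html here; instead its own group applies.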
Source: https://stackoverflow.com/questions/344697/googlebots-ignoring-robots-txt