Robots.txt: allow only major search engines

不思量自难忘° 2020-12-31 01:11

Is there a way to configure the robots.txt so that the site accepts visits ONLY from Google, Yahoo! and MSN spiders?

4 answers
  •  别那么骄傲
    2020-12-31 02:05

    There are more than three major search engines, depending on which country you are talking about. Facebook seems to do a good job of listing only legitimate ones: https://facebook.com/robots.txt

    So your robots.txt can be something like:

    User-agent: Applebot
    Allow: /
    
    User-agent: baiduspider
    Allow: /
    
    User-agent: Bingbot
    Allow: /
    
    User-agent: Facebot
    Allow: /
    
    User-agent: Googlebot
    Allow: /
    
    User-agent: msnbot
    Allow: /
    
    User-agent: Naverbot
    Allow: /
    
    User-agent: seznambot
    Allow: /
    
    User-agent: Slurp
    Allow: /
    
    User-agent: teoma
    Allow: /
    
    User-agent: Twitterbot
    Allow: /
    
    User-agent: Yandex
    Allow: /
    
    User-agent: Yeti
    Allow: /
    
    User-agent: *
    Disallow: /
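
To sanity-check rules like these before deploying them, one option is Python's standard-library `urllib.robotparser`. The sketch below is only an illustration: it uses an abbreviated copy of the rules (three of the listed crawlers plus the catch-all), and the helper name `is_allowed`, the `example.com` URL, and the agent `SomeUnknownBot` are made up for the example.

```python
from urllib.robotparser import RobotFileParser

# Abbreviated copy of the rules above: three listed crawlers plus the catch-all.
ROBOTS_TXT = """\
User-agent: Googlebot
Allow: /

User-agent: Bingbot
Allow: /

User-agent: Slurp
Allow: /

User-agent: *
Disallow: /
"""


def is_allowed(user_agent: str, url: str = "https://example.com/page") -> bool:
    """Return True if the rules permit `user_agent` to fetch `url`.

    `is_allowed` and the example.com URL are hypothetical, for illustration only.
    """
    parser = RobotFileParser()
    # parse() takes an iterable of lines, so no network request is made.
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    # "SomeUnknownBot" stands in for any crawler that is not on the list.
    for agent in ("Googlebot", "Bingbot", "Slurp", "SomeUnknownBot"):
        print(agent, is_allowed(agent))
```

Keep in mind that robots.txt is advisory: well-behaved crawlers follow it, but it does not actually block a client that chooses to ignore it.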
    
