Ruby on Rails, How to determine if a request was made by a robot or search engine spider?

Backend · 4 answers · 1697 views

有刺的猬 · 2020-12-13 21:24

I have a Rails app that records an IP address from every request to a specific URL, but in my IP database I've found Facebook bot IPs like 66.220.15.* and Google IPs (I suggest

4 Answers
  • 2020-12-13 21:30

    Another approach is to use the crawler_detect gem:

    require "crawler_detect"

    CrawlerDetect.is_crawler?("Bot user agent")
    # => true

    # or, after adding the Rack::Request extension:
    request.is_crawler?
    # => true


    It can be useful if you want to detect a large variety of different bots (more than 1,000).

  • 2020-12-13 21:35

    Since well-behaved bots typically include a reference URL in the User-Agent string they send, something like:

    request.env["HTTP_USER_AGENT"].match(/\(.*https?:\/\/.*\)/)


    is an easy way to see whether a request came from a bot rather than a human user's browser. This tends to be more robust than trying to match against a comprehensive list of bot names.
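    A quick sanity check of that heuristic (the regex is the one from this answer; the sample User-Agent strings are illustrative):

    ```ruby
    # Well-behaved bots embed a reference URL in parentheses in their UA string.
    BOT_UA_PATTERN = /\(.*https?:\/\/.*\)/

    bot_ua   = "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
    human_ua = "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36"

    bot_ua.match?(BOT_UA_PATTERN)   # => true
    human_ua.match?(BOT_UA_PATTERN) # => false
    ```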

  • 2020-12-13 21:46

    You can use the browser gem to check for bots:

    if browser.bot?
      # code here
    end
    

    https://github.com/fnando/browser

  • 2020-12-13 21:57

    Robots are expected (by common sense / courtesy more than by any kind of law) to send a User-Agent header with their requests. You can check for it using request.env["HTTP_USER_AGENT"] and filter as you please.
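    A minimal sketch of that filtering idea, using a plain Rack-style env hash; the `bot_request?` helper and the list of bot name fragments are illustrative, not from any gem mentioned above:

    ```ruby
    # Hypothetical helper: checks the User-Agent header against a small,
    # illustrative list of known bot name fragments (case-insensitive).
    BOT_FRAGMENTS = %w[googlebot bingbot facebookexternalhit crawler spider].freeze

    def bot_request?(env)
      ua = env["HTTP_USER_AGENT"].to_s.downcase
      return true if ua.empty? # many scrapers send no User-Agent at all
      BOT_FRAGMENTS.any? { |fragment| ua.include?(fragment) }
    end

    bot_request?("HTTP_USER_AGENT" => "Googlebot/2.1 (+http://www.google.com/bot.html)") # => true
    bot_request?("HTTP_USER_AGENT" => "Mozilla/5.0 (Windows NT 10.0) Firefox/120.0")     # => false
    ```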
