Can a website detect when you are using selenium with chromedriver?

后端 未结 19 2684
情歌与酒
情歌与酒 2020-11-21 05:41

I\'ve been testing out Selenium with Chromedriver and I noticed that some pages can detect that you\'re using Selenium even though there\'s no automation at all. Even when I

19条回答
  •  孤独总比滥情好
    2020-11-21 06:11

    As we've already figured out in the question and the posted answers, there is an anti Web-scraping and a Bot detection service called "Distil Networks" in play here. And, according to the company CEO's interview:

    Even though they can create new bots, we figured out a way to identify Selenium the a tool they’re using, so we’re blocking Selenium no matter how many times they iterate on that bot. We’re doing that now with Python and a lot of different technologies. Once we see a pattern emerge from one type of bot, then we work to reverse engineer the technology they use and identify it as malicious.

    It'll take time and additional challenges to understand how exactly they are detecting Selenium, but what can we say for sure at the moment:

    • it's not related to the actions you take with selenium - once you navigate to the site, you get immediately detected and banned. I've tried to add artificial random delays between actions, take a pause after the page is loaded - nothing helped
    • it's not about browser fingerprint either - tried it in multiple browsers with clean profiles and not, incognito modes - nothing helped
    • since, according to the hint in the interview, this was "reverse engineering", I suspect this is done with some JS code being executed in the browser revealing that this is a browser automated via selenium webdriver

    Decided to post it as an answer, since clearly:

    Can a website detect when you are using selenium with chromedriver?

    Yes.


    Also, what I haven't experimented with is older selenium and older browser versions - in theory, there could be something implemented/added to selenium at a certain point that Distil Networks bot detector currently relies on. Then, if this is the case, we might detect (yeah, let's detect the detector) at what point/version a relevant change was made, look into changelog and changesets and, may be, this could give us more information on where to look and what is it they use to detect a webdriver-powered browser. It's just a theory that needs to be tested.

提交回复
热议问题