Selenium webdriver: Modifying navigator.webdriver flag to prevent selenium detection

こ雲淡風輕ζ submitted on 2020-01-08 14:14:08

Question


I'm trying to automate a very basic task on a website using Selenium and Chrome, but somehow the website detects when Chrome is driven by Selenium and blocks every request. I suspect that the website relies on an exposed DOM variable like this one https://stackoverflow.com/a/41904453/648236 to detect a Selenium-driven browser.

My question is: is there a way to make the navigator.webdriver flag false? I am willing to go as far as recompiling the Selenium source after making modifications, but I cannot find the NavigatorAutomationInformation source anywhere in the repository https://github.com/SeleniumHQ/selenium

Any help is much appreciated.

P.S: I also tried the following from https://w3c.github.io/webdriver/#interface

Object.defineProperty(navigator, 'webdriver', {
    get: () => false,
});

But it only updates the property after the initial page load. I think the site detects the variable before my script is executed.


Answer 1:


You are right. The answer you referred to pointed to the 2017 state of the W3C Editor's Draft, which has evolved over the last two years. The current draft strictly states that:

The webdriver-active flag is set to true when the user agent is under remote control which is initially set to false.

Further,

Navigator includes NavigatorAutomationInformation;

It is to be noted that:

The NavigatorAutomationInformation interface should not be exposed on WorkerNavigator.

The NavigatorAutomationInformation interface is defined as:

interface mixin NavigatorAutomationInformation {
    readonly attribute boolean webdriver;
};

which returns true if webdriver-active flag is set, false otherwise.

Finally, the navigator.webdriver defines a standard way for co-operating user agents to inform the document that it is controlled by WebDriver, so that alternate code paths can be triggered during automation.

Altering any of these parameters may block the navigation and get the WebDriver instance detected.


Update (6-Nov-2019)

As of the current implementation, an ideal way to access a web page without getting detected would be to use the ChromeOptions() class to set the following:

  • the excludeSwitches experimental option (excluding enable-automation)
  • the start-maximized argument

Java Example

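import java.util.Collections;

import org.openqa.selenium.WebDriver;
import org.openqa.selenium.chrome.ChromeDriver;
import org.openqa.selenium.chrome.ChromeOptions;

public class A_Chrome 
{
    public static void main(String[] args) 
    {
        // Point Selenium at the local ChromeDriver binary
        System.setProperty("webdriver.chrome.driver", "C:\\Utility\\BrowserDrivers\\chromedriver.exe");
        ChromeOptions options = new ChromeOptions();
        options.addArguments("start-maximized");
        // Drop the enable-automation switch that ChromeDriver adds by default
        options.setExperimentalOption("excludeSwitches", Collections.singletonList("enable-automation"));
        WebDriver driver = new ChromeDriver(options);
        driver.get("https://www.google.co.in");
        System.out.println(driver.getTitle());
        driver.quit();
    }
}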
  • Console Output:

    Starting ChromeDriver 78.0.3904.70 (edb9c9f3de0247fd912a77b7f6cae7447f6d3ad5-refs/branch-heads/3904@{#800}) on port 24667
    Only local connections are allowed.
    Please protect ports used by ChromeDriver and related test frameworks to prevent access by malicious code.
    Nov 06, 2019 8:23:59 PM org.openqa.selenium.remote.ProtocolHandshake createSession
    INFO: Detected dialect: W3C
    Google
    

Python Example

from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC

options = webdriver.ChromeOptions()
options.add_argument("start-maximized")
options.add_experimental_option("excludeSwitches", ["enable-automation"])
options.add_experimental_option('useAutomationExtension', False)
driver = webdriver.Chrome(options=options, executable_path=r'C:\Utility\BrowserDrivers\chromedriver.exe')
driver.get("https://publicindex.sccourts.org/horry/publicindex/")
WebDriverWait(driver, 10).until(EC.title_contains("Index"))
print(driver.current_url)
driver.quit()
  • Console Output:

    https://publicindex.sccourts.org/horry/publicindex/
    



Answer 2:


Before (in browser console window):

> navigator.webdriver
true

Change (in selenium):

// C#
var options = new ChromeOptions();
options.AddExcludedArguments(new List<string>() { "enable-automation" });

// Python
options.add_experimental_option("excludeSwitches", ['enable-automation'])

After (in browser console window):

> navigator.webdriver
undefined

This will not work with ChromeDriver 79.0.3945.16 and above; see the ChromeDriver release notes.
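
Since the trick depends on the ChromeDriver version, it can help to check the version before relying on it. The following is only a minimal sketch: the cutoff is taken from the statement above, and the names CUTOFF, parse_version and exclude_switches_hides_flag are hypothetical helpers, not part of Selenium.

```python
# Cutoff taken from the answer above: excludeSwitches no longer hides
# navigator.webdriver from ChromeDriver 79.0.3945.16 onwards.
CUTOFF = (79, 0, 3945, 16)


def parse_version(version):
    """Turn a dotted version such as '78.0.3904.70' into a tuple of ints."""
    return tuple(int(part) for part in version.strip().split("."))


def exclude_switches_hides_flag(chromedriver_version):
    """Return True if the version is older than the cutoff (hypothetical helper)."""
    return parse_version(chromedriver_version) < CUTOFF


# 78.0.3904.70 is the version shown in the console output of Answer 1.
print(exclude_switches_hides_flag("78.0.3904.70"))
print(exclude_switches_hides_flag("79.0.3945.16"))
```

The tuples are compared element by element, so a plain string comparison (which would rank "100" below "79") is avoided.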




Answer 3:


Nowadays you can accomplish this with a CDP (Chrome DevTools Protocol) command:

driver.execute_cdp_cmd("Page.addScriptToEvaluateOnNewDocument", {
  "source": """
    Object.defineProperty(navigator, 'webdriver', {
      get: () => undefined
    })
  """
})

driver.get(some_url)

By the way, you want the getter to return undefined; false is a dead giveaway.
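
To confirm that the override took effect, the CDP call can be wrapped in a small helper and the flag read back after navigation. This is a sketch, not a definitive implementation: install_webdriver_override and read_webdriver_flag are hypothetical names, and it assumes a Chrome driver object that provides execute_cdp_cmd and execute_script.

```python
# JavaScript registered to run before any page script on every new document.
HIDE_WEBDRIVER_SCRIPT = """
Object.defineProperty(navigator, 'webdriver', {
  get: () => undefined
})
"""


def install_webdriver_override(driver):
    """Register the override via CDP; call this before driver.get()."""
    return driver.execute_cdp_cmd(
        "Page.addScriptToEvaluateOnNewDocument",
        {"source": HIDE_WEBDRIVER_SCRIPT},
    )


def read_webdriver_flag(driver):
    """Read navigator.webdriver from the current page (undefined should map to None)."""
    return driver.execute_script("return navigator.webdriver")


# Usage with a real session (requires selenium and a matching chromedriver):
#   from selenium import webdriver
#   driver = webdriver.Chrome()
#   install_webdriver_override(driver)
#   driver.get(some_url)
#   print(read_webdriver_flag(driver))
```

Registering the script before calling driver.get() is the point: it addresses the problem in the question, where the property was only redefined after the initial page load.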




Answer 4:


Try changing the user agent.

Something like this:

ChromeOptions options = new ChromeOptions();
options.addArguments("user-agent=Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/60.0.3112.50 Safari/537.36");

ChromeDriver driver = new ChromeDriver(options);
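
For the Python bindings used in the other answers, the same idea might look roughly like the sketch below. The helper names chrome_arguments and build_options are hypothetical; the Selenium import is done lazily so the argument list can be built and inspected without a browser.

```python
def chrome_arguments(user_agent, maximized=True):
    """Build the Chrome command-line arguments as plain strings (hypothetical helper)."""
    arguments = []
    if maximized:
        arguments.append("start-maximized")
    # Same "user-agent=<value>" form as in the Java snippet above.
    arguments.append("user-agent=" + user_agent)
    return arguments


def build_options(arguments):
    """Create ChromeOptions from the argument list (requires selenium)."""
    from selenium import webdriver  # imported lazily on purpose

    options = webdriver.ChromeOptions()
    for argument in arguments:
        options.add_argument(argument)
    return options


# Usage with a real session:
#   from selenium import webdriver
#   options = build_options(chrome_arguments("<your user agent string>"))
#   driver = webdriver.Chrome(options=options)
```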


Source: https://stackoverflow.com/questions/53039551/selenium-webdriver-modifying-navigator-webdriver-flag-to-prevent-selenium-detec
