Question
I'm trying to automate a very basic task on a website using Selenium and Chrome, but somehow the website detects when Chrome is driven by Selenium and blocks every request. I suspect that the website relies on an exposed DOM variable like this one https://stackoverflow.com/a/41904453/648236 to detect a Selenium-driven browser.
My question is: is there a way I can make the navigator.webdriver flag false? I am willing to go so far as to recompile the Selenium source after making modifications, but I cannot seem to find the NavigatorAutomationInformation source anywhere in the repository https://github.com/SeleniumHQ/selenium
Any help is much appreciated.
P.S: I also tried the following from https://w3c.github.io/webdriver/#interface
Object.defineProperty(navigator, 'webdriver', {
    get: () => false,
});
But it only updates the property after the initial page load. I think the site detects the variable before my script is executed.
Answer 1:
You saw it right. The answer you referred to pointed to the 2017 state of the W3C Editor's Draft, which has evolved over the last two years. The current specification states that:
The webdriver-active flag is set to true when the user agent is under remote control which is initially set to false.
Further,
Navigator includes NavigatorAutomationInformation;
It is to be noted that:
The NavigatorAutomationInformation interface should not be exposed on WorkerNavigator.
The NavigatorAutomationInformation interface is defined as:
interface mixin NavigatorAutomationInformation {
    readonly attribute boolean webdriver;
};
which returns true if the webdriver-active flag is set, false otherwise.
Finally, navigator.webdriver defines a standard way for co-operating user agents to inform the document that it is controlled by WebDriver, so that alternate code paths can be triggered during automation.
Altering any of these parameters may cause navigation to be blocked and the WebDriver instance to be detected.
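One way to see what a page script would see is to read the attribute from the Selenium session itself. The following is only a minimal sketch and not part of the original answer: read_webdriver_flag, describe_flag and FakeDriver are hypothetical names, and driver is assumed to be any object offering Selenium's execute_script method. A stand-in class is used so the snippet runs without a browser.

```python
def read_webdriver_flag(driver):
    """Return navigator.webdriver as page scripts see it (True, False, or None)."""
    # Selenium converts a JavaScript undefined/null return value to None.
    return driver.execute_script("return navigator.webdriver")


def describe_flag(value):
    """Map the raw value to a short description (the wording is illustrative)."""
    if value is True:
        return "navigator.webdriver is true"
    if value is False:
        return "navigator.webdriver is false"
    return "navigator.webdriver is undefined"


class FakeDriver:
    """Hypothetical stand-in for a real WebDriver, used only for this demo."""

    def __init__(self, flag):
        self.flag = flag
        self.scripts = []

    def execute_script(self, script):
        # Record the script and answer with the preset flag value.
        self.scripts.append(script)
        return self.flag


if __name__ == "__main__":
    print(describe_flag(read_webdriver_flag(FakeDriver(True))))
```

With a real session you would pass the ChromeDriver instance instead of FakeDriver.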
Update (6-Nov-2019)
As of the current implementation, an ideal way to access a web page without getting detected would be to use the ChromeOptions() class to set the following:
excludeSwitches (experimental option, used to exclude enable-automation)
start-maximized (argument)
Java Example
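import java.util.Collections;

import org.openqa.selenium.WebDriver;
import org.openqa.selenium.chrome.ChromeDriver;
import org.openqa.selenium.chrome.ChromeOptions;

public class A_Chrome
{
    public static void main(String[] args)
    {
        // Path to the local ChromeDriver binary
        System.setProperty("webdriver.chrome.driver", "C:\\Utility\\BrowserDrivers\\chromedriver.exe");
        ChromeOptions options = new ChromeOptions();
        options.addArguments("start-maximized");
        // Exclude the enable-automation switch that ChromeDriver passes by default
        options.setExperimentalOption("excludeSwitches", Collections.singletonList("enable-automation"));
        WebDriver driver = new ChromeDriver(options);
        driver.get("https://www.google.co.in");
        System.out.println(driver.getTitle());
        driver.quit();
    }
}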
Console Output:
Starting ChromeDriver 78.0.3904.70 (edb9c9f3de0247fd912a77b7f6cae7447f6d3ad5-refs/branch-heads/3904@{#800}) on port 24667
Only local connections are allowed.
Please protect ports used by ChromeDriver and related test frameworks to prevent access by malicious code.
Nov 06, 2019 8:23:59 PM org.openqa.selenium.remote.ProtocolHandshake createSession
INFO: Detected dialect: W3C
Google
Python Example
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC

options = webdriver.ChromeOptions()
options.add_argument("start-maximized")
# Exclude the enable-automation switch and turn off the automation extension
options.add_experimental_option("excludeSwitches", ["enable-automation"])
options.add_experimental_option('useAutomationExtension', False)
# executable_path is the Selenium 3 way to locate chromedriver; Selenium 4 takes a Service object instead
driver = webdriver.Chrome(options=options, executable_path=r'C:\Utility\BrowserDrivers\chromedriver.exe')
driver.get("https://publicindex.sccourts.org/horry/publicindex/")
# Wait up to 10 seconds for the page title to contain "Index"
WebDriverWait(driver, 10).until(EC.title_contains("Index"))
print(driver.current_url)
driver.quit()
Console Output:
https://publicindex.sccourts.org/horry/publicindex/
Answer 2:
Before (in browser console window):
> navigator.webdriver
true
Change (in selenium):
// C#
var options = new ChromeOptions();
options.AddExcludedArguments(new List<string>() { "enable-automation" });
// Python
options.add_experimental_option("excludeSwitches", ['enable-automation'])
After (in browser console window):
> navigator.webdriver
undefined
This will not work for ChromeDriver version 79.0.3945.16 and above; see the ChromeDriver release notes.
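If a script has to run against several ChromeDriver versions, the cutoff above can be checked before relying on the switch. This is a rough sketch, not something from the original answer: parse_version and exclude_switch_expected_to_work are hypothetical helper names, and the version string is assumed to be dotted integers, optionally followed by a build hash as in the ChromeDriver startup banner.

```python
def parse_version(version):
    """Turn '79.0.3945.16' (optionally followed by a build hash) into a tuple of ints."""
    # Keep only the first whitespace-separated token, e.g. '78.0.3904.70'.
    dotted = version.split()[0]
    return tuple(int(part) for part in dotted.split("."))


# First ChromeDriver version for which the answer says the switch stops working.
CUTOFF = parse_version("79.0.3945.16")


def exclude_switch_expected_to_work(chromedriver_version):
    """True if the version is older than the cutoff named in the answer."""
    return parse_version(chromedriver_version) < CUTOFF


if __name__ == "__main__":
    for v in ("78.0.3904.70 (edb9c9f3de0247fd912a77b7f6cae7447f6d3ad5)", "79.0.3945.36"):
        print(v.split()[0], exclude_switch_expected_to_work(v))
```

Tuples of integers are compared rather than the raw strings, because a plain string comparison would order "100.0" before "79.0".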
Answer 3:
Nowadays you can accomplish this with a CDP command:
driver.execute_cdp_cmd("Page.addScriptToEvaluateOnNewDocument", {
"source": """
Object.defineProperty(navigator, 'webdriver', {
get: () => undefined
})
"""
})
driver.get(some_url)
By the way, you want to return undefined; false is a dead giveaway.
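The same call can be wrapped in a small helper so it is registered once, right after the driver is created and before the first driver.get(). This is a sketch under assumptions rather than the answerer's code: install_webdriver_override and RecordingDriver are hypothetical names, and the stand-in driver only records the call so the snippet runs without Chrome. Because Page.addScriptToEvaluateOnNewDocument runs the script in every new document before the page's own scripts, it addresses the timing problem described in the question.

```python
HIDE_WEBDRIVER_JS = """
Object.defineProperty(navigator, 'webdriver', {
  get: () => undefined
})
"""


def install_webdriver_override(driver, source=HIDE_WEBDRIVER_JS):
    """Register `source` to be evaluated in every new document."""
    return driver.execute_cdp_cmd(
        "Page.addScriptToEvaluateOnNewDocument", {"source": source}
    )


class RecordingDriver:
    """Hypothetical stand-in for a Chrome WebDriver; it only records CDP calls."""

    def __init__(self):
        self.calls = []

    def execute_cdp_cmd(self, cmd, params):
        self.calls.append((cmd, params))
        # Illustrative return value; the real command returns a script identifier.
        return {"identifier": "1"}


if __name__ == "__main__":
    fake = RecordingDriver()
    install_webdriver_override(fake)
    print(fake.calls[0][0])
```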
Answer 4:
Try changing the user agent, something like this:
ChromeOptions options = new ChromeOptions();
options.addArguments("user-agent=Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/60.0.3112.50 Safari/537.36");
ChromeDriver driver = new ChromeDriver(options);
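For the Python bindings, the same idea might look roughly like the sketch below. It is not from the original answer: user_agent_argument, apply_user_agent and FakeOptions are hypothetical names, and options is assumed to be any object with an add_argument method, such as Selenium's ChromeOptions. A stand-in class keeps the snippet runnable without Selenium installed.

```python
def user_agent_argument(user_agent):
    """Build the Chrome command-line argument that overrides the user agent."""
    if not user_agent.strip():
        raise ValueError("user agent must not be empty")
    return "user-agent=" + user_agent


def apply_user_agent(options, user_agent):
    """Add the user-agent argument to an options object and return it."""
    options.add_argument(user_agent_argument(user_agent))
    return options


class FakeOptions:
    """Hypothetical stand-in for ChromeOptions; it only collects arguments."""

    def __init__(self):
        self.arguments = []

    def add_argument(self, argument):
        self.arguments.append(argument)


if __name__ == "__main__":
    ua = ("Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 "
          "(KHTML, like Gecko) Chrome/60.0.3112.50 Safari/537.36")
    print(apply_user_agent(FakeOptions(), ua).arguments)
```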
Source: https://stackoverflow.com/questions/53039551/selenium-webdriver-modifying-navigator-webdriver-flag-to-prevent-selenium-detec