The question:
Is it possible to tell a browser that is controlled by Selenium WebDriver not to load any content from external sources, or alternatively, not
The solution is to use a proxy. WebDriver integrates very well with BrowserMob Proxy: http://bmp.lightbody.net/
private WebDriver initializeDriver() throws Exception {
    // Start the server and get the Selenium proxy object
    ProxyServer server = new ProxyServer(proxy_port); // package net.lightbody.bmp.proxy
    server.start();
    server.setCaptureHeaders(true);
    // Blacklist Google Analytics
    server.blacklistRequests("https?://.*\\.google-analytics\\.com/.*", 410);
    // Or whitelist what you need (comma-separated regular expressions)
    server.whitelistRequests("https?://.*\\.yoursite\\.com/.*,https?://.*\\.someOtherYourSite\\..*".split(","), 200);
    Proxy proxy = server.seleniumProxy(); // Proxy is package org.openqa.selenium.Proxy
    // Configure it as a desired capability
    DesiredCapabilities capabilities = new DesiredCapabilities();
    capabilities.setCapability(CapabilityType.PROXY, proxy);
    // Start the driver
    WebDriver driver = new FirefoxDriver(capabilities);
    return driver;
}
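As a rough way to sanity-check patterns like the ones above before handing them to the proxy, here is a minimal sketch that uses only java.util.regex. It assumes the proxy compares each pattern against the whole URL (full-match semantics), which is worth verifying for your BrowserMob version. The class name UrlFilterSketch, its methods, and the sample URLs are made up for illustration and are not part of BrowserMob Proxy.

```java
import java.util.regex.Pattern;

// Hypothetical helper, not part of BrowserMob Proxy: tests URL patterns offline.
final class UrlFilterSketch {
    // Same blacklist pattern as in the snippet above.
    static final Pattern BLACKLIST =
            Pattern.compile("https?://.*\\.google-analytics\\.com/.*");

    // Illustrative whitelist patterns; replace with your own hosts.
    static final Pattern[] WHITELIST = {
            Pattern.compile("https?://.*\\.yoursite\\.com/.*"),
            Pattern.compile("https?://.*\\.someOtherYourSite\\..*")
    };

    // Assumption: a URL is blocked only if the whole URL matches the pattern.
    static boolean isBlacklisted(String url) {
        return BLACKLIST.matcher(url).matches();
    }

    // A URL is allowed if any whitelist pattern matches the whole URL.
    static boolean isWhitelisted(String url) {
        for (Pattern pattern : WHITELIST) {
            if (pattern.matcher(url).matches()) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        String[] samples = {
                "http://www.google-analytics.com/ga.js",
                "https://google-analytics.com/ga.js",
                "https://www.yoursite.com/index.html",
                "https://cdn.example.org/lib.js"
        };
        for (String url : samples) {
            System.out.println(url
                    + " blacklisted=" + isBlacklisted(url)
                    + " whitelisted=" + isWhitelisted(url));
        }
    }
}
```

Note that the blacklist pattern requires a dot before google-analytics, so under the full-match assumption a request to the bare domain (no subdomain) would not be blocked. Loosen the pattern if you need to cover that case.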
EDIT: People often ask for HTTP status codes; you can easily retrieve them using the proxy. The code can be something like this:
// Create a new HAR with the given label
public void setHar(String label) {
    server.newHar(label);
}

public void getHar() throws IOException {
    // FIXME: What should be done with this data?
    Har har = server.getHar();
    if (har == null) return;
    File harFile = new File("C:\\localdev\\bla.har");
    har.writeTo(harFile);
    for (HarEntry entry : har.getLog().getEntries()) {
        // Check for any 4XX and 5XX HTTP status codes
        int status = entry.getResponse().getStatus();
        if (status >= 400 && status < 600) {
            log.warn(String.format("%s %d %s", entry.getRequest().getUrl(), status,
                    entry.getResponse().getStatusText()));
            //throw new UnsupportedOperationException("Not implemented");
        }
    }
}
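If you would rather fail a test than only log a warning, the status check can be pulled out into a small helper. The following is a minimal sketch with no BrowserMob dependency: it assumes you have already copied each entry's URL and status code out of the HAR into a map. The class StatusCheckSketch and its methods are hypothetical names used for illustration only.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper: finds client (4XX) and server (5XX) errors
// in a URL -> status map extracted from the HAR entries.
final class StatusCheckSketch {
    // True for any status code in the 400-599 range.
    static boolean isErrorStatus(int status) {
        return status >= 400 && status < 600;
    }

    // Returns the URLs whose status is 4XX or 5XX, in insertion order.
    static List<String> failedUrls(Map<String, Integer> statusByUrl) {
        List<String> failed = new ArrayList<>();
        for (Map.Entry<String, Integer> entry : statusByUrl.entrySet()) {
            if (isErrorStatus(entry.getValue())) {
                failed.add(entry.getKey());
            }
        }
        return failed;
    }

    public static void main(String[] args) {
        // Sample data standing in for values read from HarEntry objects.
        Map<String, Integer> statusByUrl = new LinkedHashMap<>();
        statusByUrl.put("http://example.com/", 200);
        statusByUrl.put("http://example.com/missing.png", 404);
        statusByUrl.put("http://example.com/api", 503);
        System.out.println("Failed requests: " + failedUrls(statusByUrl));
    }
}
```

A test could then assert that failedUrls(...) returns an empty list after the page has loaded.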
You can also chain the proxy to an upstream proxy. There isn't much documentation out there about doing so:
http://www.nerdnuts.com/2014/10/browsermob-behind-a-corporate-proxy/
We were able to use browsermob behind a corporate proxy using the following code:
// start the proxy
server = new ProxyServer(9090);
server.start();
server.setCaptureContent(true);
server.setCaptureHeaders(true);
server.addHeader("accept-encoding", ""); // turn off gzip
// Configure proxy server to use our network proxy
server.setLocalHost(InetAddress.getByName("127.0.0.1"));
/**
* THIS IS THE MAJICK!
**/
HashMap<String, String> options = new HashMap<String, String>();
options.put("httpProxy", "172.20.4.115:8080");
server.setOptions(options);
server.autoBasicAuthorization("172.20.4.115", "username", "password");
// get the Selenium proxy object
Proxy proxy = server.seleniumProxy();
DesiredCapabilities capabilities = DesiredCapabilities.phantomjs();
capabilities.setCapability(CapabilityType.PROXY, proxy);