Handling Redirection w/ PhantomJS + Selenium

Submitted by 落爺英雄遲暮 on 2019-11-30 23:23:40

It turns out the page couldn't be crawled due to an error: SSL handshake failed.

The solution is to use the following line to initialize the driver:

driver = webdriver.PhantomJS(executable_path="./phantomjs", service_args=['--ignore-ssl-errors=true'])
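The same idea can be sketched as a small helper, assuming Selenium 3.x (the PhantomJS driver was removed in Selenium 4). The names build_service_args and make_driver are hypothetical, and the optional --ssl-protocol=any flag is an addition that is not part of the original answer; it is sometimes needed alongside --ignore-ssl-errors when the handshake itself fails.

```python
def build_service_args(ignore_ssl_errors=True, ssl_protocol=None):
    """Return PhantomJS command-line flags as a list of strings."""
    args = []
    if ignore_ssl_errors:
        # Same flag as in the answer above.
        args.append("--ignore-ssl-errors=true")
    if ssl_protocol is not None:
        # Assumption: e.g. "any" lets PhantomJS negotiate the protocol version.
        args.append("--ssl-protocol=" + ssl_protocol)
    return args


def make_driver(executable_path="./phantomjs"):
    """Create a PhantomJS driver with the SSL flags applied (Selenium 3.x)."""
    # Imported lazily so build_service_args works without Selenium installed.
    from selenium import webdriver

    return webdriver.PhantomJS(
        executable_path=executable_path,
        service_args=build_service_args(ssl_protocol="any"),
    )
```

Calling make_driver() with no arguments uses the same ./phantomjs path as the line above.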

I used the following settings:

DesiredCapabilities capabilities = new DesiredCapabilities();
capabilities.setJavascriptEnabled(true);
capabilities.setCapability(PhantomJSDriverService.PHANTOMJS_EXECUTABLE_PATH_PROPERTY, "drivers/phantomjs.exe");
capabilities.setCapability(PhantomJSDriverService.PHANTOMJS_PAGE_SETTINGS_PREFIX,"Y");
capabilities.setCapability("phantomjs.page.settings.userAgent", "Mozilla/5.0 (Windows NT 6.1; Win64; x64; rv:16.0) Gecko/20121026 Firefox/16.0");

// Initialize the driver with the capabilities set above

driver = new PhantomJSDriver(capabilities);

Then I executed the following lines, and they worked fine for me:

driver.get("https://login.vrealizeair.vmware.com/");
System.out.println(driver.getCurrentUrl());
System.out.println(driver.getPageSource());

Here's the output:

https://login.vrealizeair.vmware.com/sso/UI/Login
<!-- [RESPONSE_PAGE_TYPE=3DLOGIN] --><!DOCTYPE html><html><head>
    <meta charset="utf-8">
    <meta http-equiv="X-UA-Compatible" content="IE=edge">
    <title>Login | vRealize™ Air™</title>
    <link rel="stylesheet" href="/sso/css/styles.css?v=3" type="text/css">
    <link rel="shortcut icon" href="/sso/images/vmwareFavicon.ico" type="image/x-icon">

    <script async="" src="//rum-static.pingdom.net/prum.min.js"></script><script>...........................................
.....................................................
...................................................//Entire page source was displayed

I tried the following code in Python, and it seems to work fine:

from selenium import webdriver

driver = webdriver.PhantomJS("./phantomjs") 

driver.get("https://login.vrealizeair.vmware.com/")
# print() with a single argument works on both Python 2 and Python 3
print('done')
print(driver.current_url)
print(driver.page_source)

Output (working fine):

https://login.vrealizeair.vmware.com/sso/UI/Login
<!-- [RESPONSE_PAGE_TYPE=3DLOGIN] --><!DOCTYPE html><html><head>
        <meta charset="utf-8">
        <meta http-equiv="X-UA-Compatible" content="IE=edge">
        <title>Login | vRealize™ Air™</title>
        <link rel="stylesheet" href="/sso/css/styles.css?v=3" type="text/css">

Important note: start navigating from the base page. If the HTML you get back is empty, the website is probably returning a 403 error. If the login URL does not work for you, try navigating from the pages that appear before the login page.
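A minimal sketch of that fallback, written against any driver object that exposes get(), page_source and current_url as Selenium drivers do. The helper names are hypothetical, and treating the bare skeleton below as "empty" is an assumption about what PhantomJS returns when a load fails; adjust it to whatever you actually see.

```python
# Assumption: the skeleton PhantomJS reports when a page fails to load.
EMPTY_PAGE = "<html><head></head><body></body></html>"


def looks_empty(page_source):
    """True if the source is blank or just the bare HTML skeleton."""
    stripped = "".join(page_source.split())  # drop all whitespace
    return stripped == "" or stripped == "".join(EMPTY_PAGE.split())


def get_first_nonempty(driver, urls):
    """Visit urls in order; return the final URL of the first page with content."""
    for url in urls:
        driver.get(url)
        if not looks_empty(driver.page_source):
            # current_url reflects any redirect the site performed.
            return driver.current_url
    return None  # every candidate came back empty
```

For example, get_first_nonempty(driver, ["https://login.vrealizeair.vmware.com/sso/UI/Login", "https://login.vrealizeair.vmware.com/"]) tries the login URL first and falls back to the base page.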

This solution really worked for me. I was getting the error below in phantomjsdriver.log, and on attempting to log in, PhantomJS was logging out.

[DEBUG - 2017-08-19T20:37:59.288Z] Session [47739640-851e-11e7-9326-9bef0ad085f5] - page.onResourceError - {"errorCode":299,"errorString":"Error transferring https://int-test-cc.gcsip.nl:4443/rest/user/keepAlive?cacheBuster=1503175078533 - server replied: Unsupported Media Type","id":9,"status":415,"statusText":"Unsupported Media Type","url":"IPAdd:port/rest/user/keepAlive?cacheBuster=1503175078533"}

After adding the capabilities below to PhantomJS, it worked:

caps.setJavascriptEnabled(true)
caps.setCapability(PhantomJSDriverService.PHANTOMJS_EXECUTABLE_PATH_PROPERTY, "phantomjs")
caps.setCapability(PhantomJSDriverService.PHANTOMJS_PAGE_SETTINGS_PREFIX,"Y");
caps.setCapability("phantomjs.page.settings.userAgent","Mozilla/5.0 (X11; Linux x86_64; rv:46.0) Gecko/20100101 Firefox/46.0")//"Mozilla/5.0 (Unknown; Linux x86_64) AppleWebKit/602.1 (KHTML, like Gecko) PhantomJS/2.5.0-development Version/9.0 Safari/602.1")
caps.setCapability(PhantomJSDriverService.PHANTOMJS_PAGE_CUSTOMHEADERS_PREFIX + "Content-Type","application/json;charset=utf-8")
caps.setCapability(PhantomJSDriverService.PHANTOMJS_PAGE_CUSTOMHEADERS_PREFIX + "Connection","Keep-Alive")
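For Python users, a rough equivalent of these capabilities can be sketched as a plain dictionary, assuming Selenium 3.x and the GhostDriver key prefixes "phantomjs.page.settings." and "phantomjs.page.customHeaders." that the Java constants stand for. The helper names build_phantomjs_caps and make_driver are hypothetical.

```python
def build_phantomjs_caps(user_agent, headers=None):
    """Build a capabilities dict with a custom user agent and request headers."""
    caps = {
        "browserName": "phantomjs",
        "javascriptEnabled": True,
        "phantomjs.page.settings.userAgent": user_agent,
    }
    # Each custom header becomes its own capability under the headers prefix.
    for name, value in (headers or {}).items():
        caps["phantomjs.page.customHeaders." + name] = value
    return caps


def make_driver(executable_path="phantomjs"):
    """Start PhantomJS with the user agent and headers from the answer above."""
    # Imported lazily so build_phantomjs_caps works without Selenium installed.
    from selenium import webdriver

    caps = build_phantomjs_caps(
        "Mozilla/5.0 (X11; Linux x86_64; rv:46.0) Gecko/20100101 Firefox/46.0",
        headers={
            "Content-Type": "application/json;charset=utf-8",
            "Connection": "Keep-Alive",
        },
    )
    return webdriver.PhantomJS(
        executable_path=executable_path, desired_capabilities=caps
    )
```

Keeping the dictionary construction separate from driver creation makes it easy to print and inspect the capabilities before PhantomJS is launched.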