Question
I intend to create an Android application that performs a headless login to a website and then scrapes some content from the subsequent page while maintaining the logged-in session.
I first used HtmlUnit in a normal Java project and it worked just fine, but I later found that HtmlUnit is not compatible with Android.
Then I tried the Jsoup library, sending an HTTP POST request to the login form. The resulting page does not load completely, because Jsoup does not execute JavaScript.
It was then suggested that I look at Selendroid, which is actually an Android test automation framework. What I really need is an HTML parser that supports both JavaScript and Android. I find Selendroid quite difficult to understand, to the point that I can't even figure out which dependencies to use:
- selendroid-client
- selendroid-standalone
- selendroid-server
With Selenium WebDriver, the code would be as simple as the following. But can somebody show me a similar code example for Selendroid as well?
WebDriver driver = new FirefoxDriver();
driver.get("https://mail.google.com/");
driver.findElement(By.id("email")).sendKeys(myEmail);
driver.findElement(By.id("pass")).sendKeys(pass);
// Click on 'Sign In' button
driver.findElement(By.id("signIn")).click();
And also,
- Which dependencies should I add to my build.gradle file?
- Which Selendroid libraries should I import?
Answer 1:
Unfortunately I didn't get Selendroid to work, but I found a workaround: scraping the dynamic content with Android's built-in WebView, with JavaScript enabled.
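// Requires android.webkit.WebView and android.webkit.WebViewClient.
// "context" is assumed to be an Activity or other Context available to the caller.
mWebView = new WebView(context);
mWebView.getSettings().setJavaScriptEnabled(true);
// Expose HtmlHandler to page scripts under the name "HtmlHandler".
mWebView.addJavascriptInterface(new HtmlHandler(), "HtmlHandler");
mWebView.setWebViewClient(new WebViewClient() {
    @Override
    public void onPageFinished(WebView view, String url) {
        super.onPageFinished(view, url);
        // Compare string contents with equals(); == only compares references.
        if (url.equals(urlToLoad)) {
            // Pass the HTML source to the HtmlHandler
            view.loadUrl("javascript:HtmlHandler.handleHtml(document.documentElement.outerHTML);");
        }
    }
});
// Start loading the page; onPageFinished fires once it has loaded.
mWebView.loadUrl(urlToLoad);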
The JavaScript expression document.documentElement.outerHTML returns the full HTML of the loaded page. The retrieved HTML string is then passed to the handleHtml method of the HtmlHandler class.
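// Requires android.webkit.JavascriptInterface.
class HtmlHandler {
    // Called from the page's JavaScript on a background thread, not the UI thread.
    @JavascriptInterface
    @SuppressWarnings("unused")
    public void handleHtml(String html) {
        // scrape the content here
    }
}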
You may use a library like Jsoup to scrape the necessary content from the html String.
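One thing to watch in the onPageFinished check above: the WebView may report the finished URL with a trailing slash, different host casing, or a fragment, in which case an exact match against urlToLoad can miss. Below is a minimal sketch of a looser comparison in plain Java, assuming it is acceptable to ignore the query string and fragment. UrlMatcher and samePage are hypothetical names for illustration, not part of Android or Jsoup.

```java
import java.net.URI;
import java.net.URISyntaxException;

// Hypothetical helper: decides whether two URLs refer to the same page.
final class UrlMatcher {
    private UrlMatcher() {}

    /**
     * Returns true when both URLs share scheme, host, port, and path.
     * Query string and fragment are ignored (an assumption of this sketch).
     */
    static boolean samePage(String expected, String actual) {
        if (expected == null || actual == null) {
            return false;
        }
        try {
            URI a = new URI(expected).normalize();
            URI b = new URI(actual).normalize();
            return equalsIgnoreCase(a.getScheme(), b.getScheme())
                    && equalsIgnoreCase(a.getHost(), b.getHost())
                    && a.getPort() == b.getPort()
                    && stripTrailingSlash(a.getPath()).equals(stripTrailingSlash(b.getPath()));
        } catch (URISyntaxException e) {
            // Fall back to an exact comparison if either URL cannot be parsed.
            return expected.equals(actual);
        }
    }

    // Null-safe, case-insensitive comparison for scheme and host.
    private static boolean equalsIgnoreCase(String x, String y) {
        return x == null ? y == null : x.equalsIgnoreCase(y);
    }

    // Treats "/inbox/" and "/inbox" (and "" and "/") as the same path.
    private static String stripTrailingSlash(String path) {
        if (path == null || path.isEmpty()) {
            return "";
        }
        return path.endsWith("/") ? path.substring(0, path.length() - 1) : path;
    }
}
```

Inside onPageFinished this would stand in for the direct comparison, for example if (UrlMatcher.samePage(urlToLoad, url)) { ... }. If the site redirects to a different path after login, the target URL has to be checked instead.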
Answer 2:
I have never used Selendroid, so I'm not really sure about this, but searching the net I found this example and, according to it, I suppose your code translated from Selenium to Selendroid would be:
Translation code (in my opinion)
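import org.junit.After;
import org.junit.Before;
import org.junit.Test;
import org.openqa.selenium.By;
import org.openqa.selenium.WebDriver;
import org.openqa.selenium.remote.DesiredCapabilities;
// Also import SelendroidLauncher, SelendroidConfiguration, SelendroidCapabilities and
// SelendroidDriver from the Selendroid jars; their packages depend on the release you use.

public class MobileWebTest {
    private SelendroidLauncher selendroidServer = null;
    private WebDriver driver = null;

    @Test
    public void doTest() {
        // myEmail and pass are the credentials from the question's snippet.
        driver.get("https://mail.google.com/");
        // sendKeys() and click() return void, so their results cannot be assigned.
        driver.findElement(By.id("email")).sendKeys(myEmail);
        driver.findElement(By.id("pass")).sendKeys(pass);
        driver.findElement(By.id("signIn")).click();
        // The driver is closed in the @After method below.
    }

    @Before
    public void startSelendroidServer() throws Exception {
        if (selendroidServer != null) {
            selendroidServer.stopSelendroid();
        }
        SelendroidConfiguration config = new SelendroidConfiguration();
        selendroidServer = new SelendroidLauncher(config);
        selendroidServer.launchSelendroid();
        DesiredCapabilities caps = SelendroidCapabilities.android();
        driver = new SelendroidDriver(caps);
    }

    @After
    public void stopSelendroidServer() {
        if (driver != null) {
            driver.quit();
        }
        if (selendroidServer != null) {
            selendroidServer.stopSelendroid();
        }
    }
}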
What you have to add to your project
It seems that you have to add the Selendroid standalone jar file to your project. If you are unsure how to add an external jar to an Android project, see this question: How can I use external JARs in an Android project?
You can download the jar file here: jar file
Also, it seems that adding the standalone jar file alone is not enough. You should also add the selendroid-client jar file that matches the version of the standalone jar you have.
You can download it from here: client jar file
I hope this helps!
Answer 3:
I would suggest you use WebdriverIO, since you want JavaScript support. It runs on Node.js, so it is easy to require other packages to scrape the HTML.
Appium is also an alternative but it's more focused on front-end testing.
Source: https://stackoverflow.com/questions/30058692/selendroid-as-a-web-scraper