How to programmatically access a web page in Java

无人及你 2021-01-30 18:17

There is a web page from which I want to retrieve a certain string. In order to do so, I need to log in, click some buttons, fill in a text box, click another button - and then the

5 answers
  • 2021-01-30 18:43

    A very simple way to do this is to use HtmlUnit:

    http://htmlunit.sourceforge.net/

    and what you want to do can be as simple as:

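    // Imports assume HtmlUnit 2.x and JUnit 4; HtmlUnit 3.x moved to the org.htmlunit package.
    import static org.junit.Assert.assertEquals;
    import org.junit.Test;
    import com.gargoylesoftware.htmlunit.WebClient;
    import com.gargoylesoftware.htmlunit.html.HtmlPage;

    @Test
    public void homePage() throws Exception {
        final WebClient webClient = new WebClient();
        final HtmlPage page = webClient.getPage("http://htmlunit.sourceforge.net");
        assertEquals("HtmlUnit - Welcome to HtmlUnit", page.getTitleText());
    }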
    
  • 2021-01-30 18:43

    Pressing a button usually sends a request via the HTTP POST method, so you can use HttpClient to send the request and HtmlParser to extract the string you need from the response page.
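
    A minimal sketch of that request-then-parse flow, using only the JDK (Java 11+). Note the substitutions: the built-in java.net.http.HttpClient stands in for Apache HttpClient, and a plain substring search stands in for a real HTML parser such as HtmlParser. The class name FormPoster and its helper names are hypothetical, and the URL and field names of a real site have to be read from its own form markup.

```java
import java.io.IOException;
import java.net.CookieManager;
import java.net.URI;
import java.net.URLEncoder;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Optional;
import java.util.stream.Collectors;

// Hypothetical helper class, not part of any library.
class FormPoster {

    // Builds a client that stores cookies, so a session cookie set by a
    // login request is sent again on later requests made with the same client.
    static HttpClient newSessionClient() {
        return HttpClient.newBuilder()
                .cookieHandler(new CookieManager())
                .build();
    }

    // Encodes form fields as application/x-www-form-urlencoded.
    static String encodeForm(Map<String, String> fields) {
        return fields.entrySet().stream()
                .map(e -> URLEncoder.encode(e.getKey(), StandardCharsets.UTF_8)
                        + "=" + URLEncoder.encode(e.getValue(), StandardCharsets.UTF_8))
                .collect(Collectors.joining("&"));
    }

    // Sends the fields as an HTTP POST and returns the response body as text.
    static String post(HttpClient client, String url, Map<String, String> fields)
            throws IOException, InterruptedException {
        HttpRequest request = HttpRequest.newBuilder(URI.create(url))
                .header("Content-Type", "application/x-www-form-urlencoded")
                .POST(HttpRequest.BodyPublishers.ofString(encodeForm(fields)))
                .build();
        return client.send(request, HttpResponse.BodyHandlers.ofString()).body();
    }

    // Returns the text between the first occurrence of 'start' and the next
    // occurrence of 'end'. A crude stand-in for a real HTML parser.
    static Optional<String> textBetween(String html, String start, String end) {
        int from = html.indexOf(start);
        if (from < 0) {
            return Optional.empty();
        }
        from += start.length();
        int to = html.indexOf(end, from);
        return to < 0 ? Optional.empty() : Optional.of(html.substring(from, to));
    }
}
```

    A call would look like FormPoster.post(client, "http://some_url", fields), followed by FormPoster.textBetween(...) on the returned HTML. Substring matching breaks easily when the markup changes, so a proper parser is the safer choice for anything beyond a quick script.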

  • 2021-01-30 18:47

    Take a look at the Apache HttpClient project, or, if you need to run JavaScript on the page, try HttpUnit.

  • 2021-01-30 19:03

    Try HtmlUnit

    HtmlUnit is a "GUI-Less browser for Java programs". It models HTML documents and provides an API that allows you to invoke pages, fill out forms, click links, etc., just like you do in your "normal" browser.

    Example code for submitting a form:

    @Test
    public void submittingForm() throws Exception {
        final WebClient webClient = new WebClient();
    
        // Get the first page
        final HtmlPage page1 = webClient.getPage("http://some_url");
    
        // Get the form that we are dealing with and within that form, 
        // find the submit button and the field that we want to change.
        final HtmlForm form = page1.getFormByName("myform");
    
        final HtmlSubmitInput button = form.getInputByName("submitbutton");
        final HtmlTextInput textField = form.getInputByName("userid");
    
        // Change the value of the text field
        textField.setValueAttribute("root");
    
        // Now submit the form by clicking the button and get back the second page.
        final HtmlPage page2 = button.click();
    
        // Release resources; close() replaces closeAllWindows(), which is deprecated in later HtmlUnit releases.
        webClient.close();
    }
    

    For more details check: http://htmlunit.sourceforge.net/gettingStarted.html

  • 2021-01-30 19:07

    Yes:

    • java.net.URL#openConnection() lets you make HTTP requests and read the HTTP responses.

    • Apache HttpComponents is a library that makes it easier to work with HTTP.
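
    As a rough illustration of the first option, the sketch below fetches a page body with nothing but java.net.URL#openConnection(). It assumes Java 9+ (for InputStream.readAllBytes) and a UTF-8 response; the class name PageFetcher and the 10-second timeouts are arbitrary choices for the example, not anything prescribed by the JDK.

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.URI;
import java.net.URLConnection;
import java.nio.charset.StandardCharsets;

// Hypothetical helper class, not part of any library.
class PageFetcher {

    // Opens a connection to the given URL and returns the response body as text.
    // Assumes the body is UTF-8 encoded.
    static String fetch(String url) throws IOException {
        URLConnection conn = URI.create(url).toURL().openConnection();
        conn.setConnectTimeout(10_000); // milliseconds
        conn.setReadTimeout(10_000);    // milliseconds
        try (InputStream in = conn.getInputStream()) {
            return new String(in.readAllBytes(), StandardCharsets.UTF_8);
        }
    }
}
```

    PageFetcher.fetch("http://some_url") returns the raw HTML as a string. Logging in, keeping cookies, and submitting forms all have to be handled by hand at this level, which is where a library such as Apache HttpComponents saves work.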
