How to fill a form with Jsoup?

情话喂你 2020-11-30 11:51

I am trying to navigate to the description page of the California website http://kepler.sos.ca.gov/, but I am unable to get there.

Then, I have an HTML form, on which I am submitting re

2 answers
  • 2020-11-30 12:22

    This is the same code as in the accepted answer, updated to reflect the changes California made to its website after that answer was posted. As of this writing, the code works. I have updated the original comments to note each change.

    // * Connect to website (Original url: http://kepler.sos.ca.gov/)
    String url = "https://businesssearch.sos.ca.gov/";
    Connection.Response resp = Jsoup.connect(url) //
                                    .timeout(30000) //
                                    .method(Connection.Method.GET) //
                                    .execute();
    
    // * Find the form (Original jsoup selector: form#aspnetForm)
    Document responseDocument = resp.parse();
    Element potentialForm = responseDocument.select("form#formSearch").first();
    checkElement("form element", potentialForm);
    FormElement form = (FormElement) potentialForm;
    
    // * Fill in the form and submit it
    // ** Search Type (Original jsoup selector: name$=RadioButtonList_SearchType)
    Element radioButtonListSearchType = form.select("[name$=SearchType]").first();
    checkElement("search type radio button list", radioButtonListSearchType);
    radioButtonListSearchType.attr("checked", "checked");
    
    // ** Name search (Original jsoup selector: name$=TextBox_NameSearch)
    Element textBoxNameSearch = form.select("[name$=SearchCriteria]").first();
    checkElement("name search text box", textBoxNameSearch);
    textBoxNameSearch.val("cali");
    
    // ** Submit the form
    Document searchResults = form.submit().cookies(resp.cookies()).post();
    
    // * Extract results (entity numbers in this sample code, original jsoup selector: id$=SearchResults_Corp)
    for (Element entityNumber : searchResults.select("table[id$=enitityTable] > tbody > tr > td:first-of-type:not(td[colspan=5])")) {
        System.out.println(entityNumber.text());
    }
    
  • 2020-11-30 12:39

    You want to use FormElement, a useful feature of Jsoup. It finds the fields declared inside a form and posts them for you. Before posting the form, you can set the field values through the Jsoup API.
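
    To make the field collection concrete, here is a minimal offline sketch. It assumes a made-up form (the markup, the https://example.com/ base URI, and the FormDataSketch class name are all hypothetical, not taken from the California site) and uses FormElement#formData to list what would be posted, without sending any request.

```java
import org.jsoup.Connection;
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;
import org.jsoup.nodes.FormElement;

import java.util.LinkedHashMap;
import java.util.Map;

public class FormDataSketch {

    // Hypothetical markup, used only for illustration.
    static final String HTML =
          "<form id='search' method='post' action='/results'>"
        + "<input type='hidden' name='token' value='abc123'>"
        + "<input type='text' name='query' value=''>"
        + "</form>";

    // Fills the text field and returns the key/value pairs the form would post.
    static Map<String, String> collectFormData(String query) {
        Document doc = Jsoup.parse(HTML, "https://example.com/");

        // Jsoup parses <form> tags into FormElement instances.
        Element found = doc.select("form#search").first();
        if (!(found instanceof FormElement)) {
            throw new IllegalStateException("Unable to find the form");
        }
        FormElement form = (FormElement) found;

        // "Type" into the text field by setting its value.
        Element queryField = form.select("[name=query]").first();
        if (queryField == null) {
            throw new IllegalStateException("Unable to find the query field");
        }
        queryField.val(query);

        // formData() gathers the submittable fields, hidden ones included.
        Map<String, String> data = new LinkedHashMap<>();
        for (Connection.KeyVal kv : form.formData()) {
            data.put(kv.key(), kv.value());
        }
        return data;
    }

    public static void main(String[] args) {
        System.out.println(collectFormData("cali"));
    }
}
```

    The hidden token field is picked up without any extra code, which is the main reason to prefer FormElement over building the POST data by hand.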

    Note:

    In the sample code below, every call to Element#select is followed by a call to Elements#first.

    For example: responseDocument.select("form#aspnetForm").first()

    Jsoup 1.11.1 introduced a more efficient alternative, Element#selectFirst, which can replace that pair of calls directly.

    For example:
    responseDocument.select("form#aspnetForm").first()
    can be replaced by
    responseDocument.selectFirst("form#aspnetForm")
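
    The equivalence of the two forms can be checked offline with a small sketch. The markup and the SelectFirstSketch class name are assumptions made for illustration, and the code needs Jsoup 1.11.1 or later for selectFirst.

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;

public class SelectFirstSketch {

    // Hypothetical markup with two forms, used only for illustration.
    static final String HTML =
          "<form id='aspnetForm'><input name='q'></form>"
        + "<form id='other'></form>";

    // True when both lookups return the very same element of the document.
    static boolean sameElement() {
        Document doc = Jsoup.parse(HTML);
        Element viaSelect = doc.select("form#aspnetForm").first();
        Element viaSelectFirst = doc.selectFirst("form#aspnetForm");
        return viaSelect != null && viaSelect == viaSelectFirst;
    }

    // True when both lookups return null for a selector that matches nothing.
    static boolean missingIsNull() {
        Document doc = Jsoup.parse(HTML);
        return doc.select("form#missing").first() == null
            && doc.selectFirst("form#missing") == null;
    }

    public static void main(String[] args) {
        System.out.println("same element: " + sameElement());
        System.out.println("null when missing: " + missingIsNull());
    }
}
```

    Because both forms return null when nothing matches, the checkElement helper used in the sample code works unchanged with either one.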

    SAMPLE CODE

    // * Connect to website
    String url = "http://kepler.sos.ca.gov/";
    Connection.Response resp = Jsoup.connect(url) //
                                    .timeout(30000) //
                                    .method(Connection.Method.GET) //
                                    .execute();
    
    // * Find the form
    Document responseDocument = resp.parse();
    Element potentialForm = responseDocument.select("form#aspnetForm").first();
    checkElement("form element", potentialForm);
    FormElement form = (FormElement) potentialForm;
    
    // * Fill in the form and submit it
    // ** Search Type
    Element radioButtonListSearchType = form.select("[name$=RadioButtonList_SearchType]").first();
    checkElement("search type radio button list", radioButtonListSearchType);
    radioButtonListSearchType.attr("checked", "checked");
    
    // ** Name search
    Element textBoxNameSearch = form.select("[name$=TextBox_NameSearch]").first();
    checkElement("name search text box", textBoxNameSearch);
    textBoxNameSearch.val("cali");
    
    // ** Submit the form
    Document searchResults = form.submit().cookies(resp.cookies()).post();
    
    // * Extract results (entity numbers in this sample code)
    for (Element entityNumber : searchResults.select("table[id$=SearchResults_Corp] > tbody > tr > td:first-of-type:not(td[colspan=5])")) {
        System.out.println(entityNumber.text());
    }
    
    public static void checkElement(String name, Element elem) {
        if (elem == null) {
            throw new RuntimeException("Unable to find " + name);
        }
    }
    

    OUTPUT (as of this writing)

    C3036475
    C3027305
    C3236514
    C3027304
    C3034012
    C3035110
    C3028330
    C3035378
    C3124793
    C3734637
    

    See also:

    In this example, we will log into the GitHub website by using the FormElement class.

    // # Constants used in this example
    final String USER_AGENT = "Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/51.0.2704.103 Safari/537.36"; 
    final String LOGIN_FORM_URL = "https://github.com/login";
    final String USERNAME = "yourUsername";  
    final String PASSWORD = "yourPassword";  
    
    // # Go to login page
    Connection.Response loginFormResponse = Jsoup.connect(LOGIN_FORM_URL)
                                                 .method(Connection.Method.GET)
                                                 .userAgent(USER_AGENT)
                                                 .execute();  
    
    // # Fill the login form
    // ## Find the form first...
    FormElement loginForm = (FormElement)loginFormResponse.parse()
                                             .select("div#login > form").first();
    checkElement("Login Form", loginForm);
    
    // ## ... then "type" the username ...
    Element loginField = loginForm.select("#login_field").first();
    checkElement("Login Field", loginField);
    loginField.val(USERNAME);
    
    // ## ... and "type" the password
    Element passwordField = loginForm.select("#password").first();
    checkElement("Password Field", passwordField);
    passwordField.val(PASSWORD);        
    
    
    // # Now send the form for login
    Connection.Response loginActionResponse = loginForm.submit()
             .cookies(loginFormResponse.cookies())
             .userAgent(USER_AGENT)  
             .execute();
    
    System.out.println(loginActionResponse.parse().html());
    
    public static void checkElement(String name, Element elem) {
        if (elem == null) {
            throw new RuntimeException("Unable to find " + name);
        }
    }
    

    All the form data is handled for us by the FormElement class (including detection of the form method). Invoking FormElement#submit builds a ready-made Connection. All we have to do is complete this connection with additional headers (cookies, user agent, etc.) and execute it.
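
    What FormElement#submit prepares can be inspected before anything is sent. The sketch below assumes a made-up login form (the markup, the https://example.com/login base URI, the user agent string, and the SubmitSketch class name are hypothetical, not GitHub's) and reads the method and URL from the prepared request without executing it.

```java
import org.jsoup.Connection;
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.FormElement;

public class SubmitSketch {

    // Hypothetical login markup, used only for illustration.
    static final String HTML =
          "<div id='login'><form method='post' action='/session'>"
        + "<input type='text' id='login_field' name='login'>"
        + "<input type='password' id='password' name='password'>"
        + "</form></div>";

    // Fills the form and returns the prepared (not yet executed) connection.
    static Connection prepareLogin(String user, String pass) {
        // A base URI is required so the relative action can be resolved.
        Document doc = Jsoup.parse(HTML, "https://example.com/login");
        FormElement form = (FormElement) doc.select("div#login > form").first();
        form.select("#login_field").first().val(user);
        form.select("#password").first().val(pass);

        // submit() only builds the request; nothing is sent until execute().
        return form.submit().userAgent("Mozilla/5.0");
    }

    // Summarizes the HTTP method and target URL of a prepared connection.
    static String describe(Connection connection) {
        Connection.Request request = connection.request();
        return request.method() + " " + request.url();
    }

    public static void main(String[] args) {
        Connection connection = prepareLogin("yourUsername", "yourPassword");
        System.out.println(describe(connection));
        for (Connection.KeyVal kv : connection.request().data()) {
            System.out.println(kv.key() + " -> " + kv.value());
        }
    }
}
```

    Inspecting the request this way is a cheap check that the selectors matched the intended fields before a real login attempt is made.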
