HTML parsing with JSoup

天涯浪人 2021-01-01 08:17

I am trying to parse the HTML of the following URL:

http://ocw.mit.edu/courses/aeronautics-and-astronautics/16-050-thermal-energy-fall-2002/

to obtain the te

5 answers
  • 2021-01-01 08:30

    Here's a short example:

    // Connect to the website and parse it into a document
    Document doc = Jsoup.connect("http://ocw.mit.edu/courses/aeronautics-and-astronautics/16-050-thermal-energy-fall-2002/").get();
    
    // Select all elements you need (see below for documentation)
    Elements elements = doc.select("div[class=chpstaff] p");
    
    // Get the text of the first element
    String instructor = elements.first().text();
    
    // e.g. print the result
    System.out.println(instructor);
    

    Take a look at the documentation of the jsoup selector API here: jsoup Cookbook
    It's not very difficult to use, but it is very powerful.
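
    To try the selector without hitting the network, a minimal sketch along these lines may help. It assumes jsoup is on the classpath; the markup in SAMPLE_HTML and the names InstructorSelectorSketch and firstInstructor are hypothetical and are not taken from the real page. Note that first() returns null when nothing matches, so the sketch guards against that.

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;

public class InstructorSelectorSketch {

    // Hypothetical markup: the structure of the real page is assumed, not verified.
    static final String SAMPLE_HTML =
        "<html><body>"
        + "<div class=\"chpstaff\"><p>Prof. Example Instructor</p></div>"
        + "<div class=\"other\"><p>Unrelated paragraph</p></div>"
        + "</body></html>";

    // Returns the text of the first <p> inside div.chpstaff, or "" if nothing matches.
    static String firstInstructor(String html) {
        Document doc = Jsoup.parse(html);
        Element first = doc.select("div.chpstaff p").first();
        // first() yields null for an empty selection, so check before calling text().
        return first == null ? "" : first.text();
    }

    public static void main(String[] args) {
        System.out.println(firstInstructor(SAMPLE_HTML));
    }
}
```

    Swapping Jsoup.parse(html) for Jsoup.connect(url).get() should give the same behaviour against a live page, provided the page really uses that class name.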

  • 2021-01-01 08:34

    I don't know anything about JSoup, but if you want the instructor's name, it seems you could access it with something like:

    // select() returns Elements, so take the first match to get a single Element
    Element instructor = doc.select("div.chpstaff div p").first();
    
  • 2021-01-01 08:37

    Here is some code:

    Document document = Jsoup.connect("http://ocw.mit.edu/courses/aeronautics-and-astronautics/16-050-thermal-energy-fall-2002/").get();
    
    Elements elements = document.select("p");
    System.out.println(elements.html());
    

    You can select all <p> tags using jsoup's selector syntax. Calling html() on the result returns the text and tags inside the matched <p> elements.
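
    The difference between html() and text() on a selection can be seen with a small offline sketch. It assumes jsoup is on the classpath; the sample markup and the names HtmlVersusTextSketch and paragraphs are made up for illustration.

```java
import org.jsoup.Jsoup;
import org.jsoup.select.Elements;

public class HtmlVersusTextSketch {

    // Hypothetical markup used only for this illustration.
    static final String SAMPLE_HTML =
        "<p>Taught by <b>someone</b></p><p>Fall 2002</p>";

    // Parses the given HTML string and selects every <p> element.
    static Elements paragraphs(String html) {
        return Jsoup.parse(html).select("p");
    }

    public static void main(String[] args) {
        Elements ps = paragraphs(SAMPLE_HTML);
        // html() keeps nested tags such as <b>.
        System.out.println(ps.html());
        // text() strips the tags and returns only the readable text.
        System.out.println(ps.text());
    }
}
```

    If only the readable text is needed, text() is usually the better fit; html() is useful when the nested markup matters.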

  • 2021-01-01 08:53

    Maybe you have already solved this, but I worked on it, so I can't resist posting it:

    import java.io.IOException;
    import java.util.logging.*;
    import org.jsoup.*;
    import org.jsoup.nodes.*;
    import org.jsoup.select.*;

    public class JavaApplication17 {

        public static void main(String[] args) {
            try {
                // The URL must not contain spaces
                String url = "http://ocw.mit.edu/courses/aeronautics-and-astronautics/16-050-thermal-energy-fall-2002/";
                Document doc = Jsoup.connect(url).get();

                // Select every <p> element and print its text
                Elements paragraphs = doc.select("p");
                for (Element p : paragraphs) {
                    System.out.println(p.text());
                }
            } catch (IOException ex) {
                Logger.getLogger(JavaApplication17.class.getName())
                      .log(Level.SEVERE, null, ex);
            }
        }
    }
    
    Is this what you meant?
    
  • 2021-01-01 08:54

    Elements ele = doc.select("p");
    String text = ele.text();
    System.out.println(text);
    

    Try this; I think it will work.
