How to get text from this HTML page with jsoup?

忘了有多久 2020-12-12 05:08

I am using this code to retrieve the text of the main article on this page.

public class HtmlparserExampleActivity extends Activity {
String outputtext;
  Ta         


        
1 Answer
  • 2020-12-12 05:41

    Here's a simplified extract of the relevant part of your question:

    Document doc = Jsoup.connect("http://movies.ign.com/articles/100/1002569p1.html").get();
    Elements elementsHtml = doc.getElementsByTag("main-article-content");  
    // ...
    

    You're making a fundamental mistake here. There is no HTML tag named <main-article-content> in the document, so getElementsByTag("main-article-content") matches nothing. There is, however, a <div id="main-article-content">. According to the CSS selector overview about halfway down this Jsoup cookbook, you should be using the #id selector.

    Document doc = Jsoup.connect("http://movies.ign.com/articles/100/1002569p1.html").get();
    Element mainArticleContent = doc.select("#main-article-content").first();  
    // ...
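
To get the actual text once the element is selected, something along these lines should work. This is only a sketch: it parses an inline HTML string rather than fetching the IGN page, and the class name `MainArticleText`, the helper `extractText`, and the sample markup are made up for illustration. `Jsoup.parse`, `select`, `first` and `text` are the regular jsoup calls.

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;

public class MainArticleText {

    // Hypothetical helper: returns the text of the first element matching
    // the CSS query, or null when nothing matches.
    static String extractText(String html, String cssQuery) {
        Document doc = Jsoup.parse(html);
        Element element = doc.select(cssQuery).first(); // first() is null on an empty result
        return element == null ? null : element.text(); // text() drops the markup
    }

    public static void main(String[] args) {
        // Made-up markup standing in for the real page.
        String html = "<html><body>"
                + "<div id=\"main-article-content\"><p>First paragraph.</p><p>Second paragraph.</p></div>"
                + "</body></html>";

        // "#main-article-content" is an id selector and matches the div.
        System.out.println(extractText(html, "#main-article-content"));

        // "main-article-content" is a tag selector; no such tag exists, so this is null.
        System.out.println(extractText(html, "main-article-content"));
    }
}
```

The null check matters because `first()` returns null when the selector matches nothing, which is exactly what happens with the tag-name lookup from the question.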
    