Java - Obtain text within script tag using Jsoup

渐次进展 2021-01-01 12:19

I am using the Jsoup library to read a URL. This URL has text within a few <script> tags. How can I obtain the text within each script tag?

4 Answers
  • 2021-01-01 12:30

    Yes. You can use Element#getElementsByTag() to get all the script tags. The content of each script tag is represented by a DataNode.

    Document doc = Jsoup.connect("http://stackoverflow.com/questions/16780517/java-obtain-text-within-script-tag-using-jsoup").timeout(10000).get();
    Elements scriptElements = doc.getElementsByTag("script");

    for (Element element : scriptElements) {
        for (DataNode node : element.dataNodes()) {
            System.out.println(node.getWholeData());
        }
        System.out.println("-------------------");
    }
    
  • 2021-01-01 12:30
    Document doc = Jsoup.parse(html);
    Elements scripts = doc.getElementsByTag("script");
    for (Element script : scripts) {
        System.out.println(script.data());
    }
    
  • 2021-01-01 12:41

    Alternatively, you could use the Element#html() method, which returns the inner HTML of an element.

    Since Jsoup 1.11.1: Use the efficient Element#selectFirst() method to find the script element.

    Document doc = Jsoup.connect("http://www.example.com").timeout(10000).get();
    Element scriptElement = doc.selectFirst("script");
    
    // Don't forget to check scriptElement is not null...
    
    String jsCode = scriptElement.html(); 
    

    Up to Jsoup 1.10.3: Combine Element#select() and Elements#first() calls to find the script element.

    Document doc = Jsoup.connect("http://www.example.com").timeout(10000).get();
    Element scriptElement = doc.select("script").first();
    
    // Don't forget to check scriptElement is not null...
    
    String jsCode = scriptElement.html(); 
    
  • 2021-01-01 12:47

    For your case, the solution could look like this:

    Document doc = Jsoup.connect("http://www.example.com").timeout(10000).get();
    Elements scripts = doc.select("script");
    
    String scriptData = null; // stays null if no matching script is found
    for (Element script : scripts) {
        // attr() returns "" when the attribute is absent
        if ("text/javascript".equalsIgnoreCase(script.attr("type"))) {
            scriptData = script.data(); // your text from the script
            break;
        }
    }
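One caveat with filtering on the type attribute: many pages omit it, and Jsoup's attr("type") then returns an empty string, so a strict comparison against "text/javascript" skips those scripts even though browsers run them as JavaScript. A minimal sketch of a more tolerant check follows. The class ScriptTypes, the helper isJavaScriptType, and the short list of accepted values are assumptions made for illustration; they are not part of Jsoup and the list is not exhaustive.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

class ScriptTypes {
    // Assumed, non-exhaustive set of type values to treat as JavaScript.
    // The empty string covers script tags that have no type attribute.
    private static final Set<String> JS_TYPES = new HashSet<>(Arrays.asList(
            "", "text/javascript", "application/javascript", "module"));

    // Hypothetical helper: true if the given type attribute value denotes JavaScript.
    static boolean isJavaScriptType(String type) {
        if (type == null) {
            return true; // treat a missing value like an empty one
        }
        // Compare case-insensitively and ignore surrounding whitespace
        return JS_TYPES.contains(type.trim().toLowerCase(Locale.ROOT));
    }

    public static void main(String[] args) {
        System.out.println(isJavaScriptType(""));                 // true
        System.out.println(isJavaScriptType("Text/JavaScript"));  // true
        System.out.println(isJavaScriptType("application/json")); // false
    }
}
```

In the loop, the condition could then read isJavaScriptType(script.attr("type")) instead of comparing against a single string.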
    