Getting Jsoup to support HTML dynamically generated by JavaScript

情话喂你 2020-11-29 10:44

Right now I'm working on a web crawler. It should parse some specific sites and write its output to an XML file. Up to this point, it's no problem. The Crawler wor

1 Answer
  • 2020-11-29 11:26

    Jsoup does not support JavaScript and does not emulate a browser; it only parses the HTML it is given. If you need JavaScript to be executed, Jsoup alone will not do it. In my experience HtmlUnit, which is a headless browser, has given me the best results (speaking only of Java frameworks).
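
As a rough illustration of that approach, the sketch below lets HtmlUnit load the page and run its scripts, then hands the resulting markup to Jsoup for parsing. It assumes the HtmlUnit 2.x package name (`com.gargoylesoftware.htmlunit`; newer releases use `org.htmlunit`) and both libraries on the classpath. The class name `RenderedPageFetcher`, the 5 second wait, and the example URL are placeholders, not something from the question.

```java
import com.gargoylesoftware.htmlunit.BrowserVersion;
import com.gargoylesoftware.htmlunit.WebClient;
import com.gargoylesoftware.htmlunit.html.HtmlPage;
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;

import java.io.IOException;

// Hypothetical helper class, named here only for illustration.
public class RenderedPageFetcher {

    // Loads the URL in HtmlUnit, gives background scripts some time to run,
    // and returns the resulting DOM serialized as markup.
    static String fetchRenderedHtml(String url, long jsWaitMillis) throws IOException {
        try (WebClient webClient = new WebClient(BrowserVersion.CHROME)) {
            webClient.getOptions().setJavaScriptEnabled(true);
            webClient.getOptions().setCssEnabled(false);
            // Many real-world pages contain scripts HtmlUnit cannot run cleanly;
            // do not abort the whole load because of one script error.
            webClient.getOptions().setThrowExceptionOnScriptError(false);

            // Assumes the URL returns HTML; other content types are not HtmlPage.
            HtmlPage page = webClient.getPage(url);
            webClient.waitForBackgroundJavaScript(jsWaitMillis);
            return page.asXml();
        }
    }

    public static void main(String[] args) throws IOException {
        // Placeholder URL; pass the real site as the first argument.
        String url = args.length > 0 ? args[0] : "https://example.com/";
        String html = fetchRenderedHtml(url, 5_000);

        // From here on, the usual Jsoup selectors work on the rendered markup.
        Document doc = Jsoup.parse(html, url);
        System.out.println(doc.title());
    }
}
```

How long to wait for background JavaScript depends on the site, so the timeout is something to tune rather than a recommended value.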

    One thing worth trying in HtmlUnit is changing the BrowserVersion (Chrome / Internet Explorer / Firefox) when creating the WebClient instance. Some sites respond differently depending on the browser, and sometimes changing just that value gives you the results you expect.
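
One possible way to try this is to load the same URL once per `BrowserVersion` and compare what comes back. The sketch below again assumes the HtmlUnit 2.x package name (`com.gargoylesoftware.htmlunit`); the class `BrowserVersionComparison`, the choice of comparing markup length, and the example URL are illustrative assumptions only.

```java
import com.gargoylesoftware.htmlunit.BrowserVersion;
import com.gargoylesoftware.htmlunit.WebClient;
import com.gargoylesoftware.htmlunit.html.HtmlPage;

import java.io.IOException;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper class, named here only for illustration.
public class BrowserVersionComparison {

    // Loads the URL while emulating the given browser and returns the size
    // of the rendered markup, as a crude signal of "did the content show up".
    static int renderedLength(String url, BrowserVersion version) throws IOException {
        try (WebClient webClient = new WebClient(version)) {
            webClient.getOptions().setThrowExceptionOnScriptError(false);
            // Assumes the URL returns HTML.
            HtmlPage page = webClient.getPage(url);
            webClient.waitForBackgroundJavaScript(5_000);
            return page.asXml().length();
        }
    }

    // Tries a fixed set of emulated browsers and maps each nickname to its result.
    static Map<String, Integer> compare(String url) throws IOException {
        BrowserVersion[] versions = { BrowserVersion.CHROME, BrowserVersion.FIREFOX };
        Map<String, Integer> results = new LinkedHashMap<>();
        for (BrowserVersion version : versions) {
            results.put(version.getNickname(), renderedLength(url, version));
        }
        return results;
    }

    public static void main(String[] args) throws IOException {
        // Placeholder URL; pass the real site as the first argument.
        String url = args.length > 0 ? args[0] : "https://example.com/";
        compare(url).forEach((name, length) ->
                System.out.println(name + ": " + length + " characters"));
    }
}
```

A large difference between the emulated browsers suggests the site branches on the browser, in which case the version that returns the fuller page is the one to keep.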
