Screen scraping web page after delay

后端 未结 4 762
盖世英雄少女心
盖世英雄少女心 2020-12-20 07:49

I\'m trying to scrape a web page using C#, however after the page loads, it executes some javascript which loads more elements into the DOM which I need to scrape. A standar

相关标签:
4条回答
  • 2020-12-20 08:03

    The approach you have will not work regardless how long you wait, you need a browser to execute the javascript (or something that understands javascript).

    Try this question: What's a good tool to screen-scrape with Javascript support?

    0 讨论(0)
  • 2020-12-20 08:05

    You would need to execute the javascript yourself to get this functionality. Currently, your code only receives whatever the server replies with at the URL you request. The rest of the listings are "showing up" because the browser downloads, parses, and executes the accompanying javascript.

    0 讨论(0)
  • 2020-12-20 08:17

    The answer to this similar question says to use a web browser control to read the page in and process it before scraping it. Perhaps with some kind of timer delay to give the javascript some time to execute and return results.

    0 讨论(0)
  • 2020-12-20 08:21

    To execute the JavaScript I use webkit to render the page, which is the engine used by Chrome and Safari. Here is an example using its Python bindings.

    Webkit also has .NET bindings but I haven't used them.

    0 讨论(0)
提交回复
热议问题