Python 3, Web-scraping, and Javascript [Oh My]

误落风尘 2021-02-04 21:31

I have come to the point of entering the melee on scraping webpages that use JavaScript, with Python 3. I am well aware that my boot may be making contact with a dead horse, but

1 answer
  •  既然无缘
    2021-02-04 21:48

    When a page loads data via JavaScript, it has to request that data from the server, typically through the XMLHttpRequest (XHR) API. You can see which requests the page makes and then make them yourself, for example with wget!

    To find out which requests they are making, use the Web Inspector (Chrome and Safari) or Firebug (Firefox). Here's how to do it in Chrome:

    Wrench menu > Tools > Developer Tools > Network (tab at the top of the tools) > XHR filter at the bottom.

    Here's an example request they make in javascript

    If you look closely at the XHR request URL, you'll notice that all trailing-returns requests share the same format:

    http://performance.morningstar.com/Performance/cef/trailing-total-returns.action?t=

    You just need to specify t. For example:

    http://performance.morningstar.com/Performance/cef/trailing-total-returns.action?t=VAW
    http://performance.morningstar.com/Performance/cef/trailing-total-returns.action?t=INTC
    http://performance.morningstar.com/Performance/cef/trailing-total-returns.action?t=VHCOX
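
Since the question is about Python 3, the same URL pattern can be sketched with the standard library instead of typing each address by hand. This is only an illustration: `BASE_URL` is copied from the addresses above, and `build_url` is a hypothetical helper name, not something from the site or from any library.

```python
from urllib.parse import urlencode

# Endpoint copied from the answer; there is no guarantee it is still live.
BASE_URL = "http://performance.morningstar.com/Performance/cef/trailing-total-returns.action"


def build_url(ticker: str) -> str:
    """Return the XHR URL for one ticker symbol (the `t` query parameter)."""
    # urlencode also escapes any characters that are not URL-safe.
    return f"{BASE_URL}?{urlencode({'t': ticker})}"


# The three tickers used as examples in the answer.
urls = [build_url(ticker) for ticker in ("VAW", "INTC", "VHCOX")]
```

Going through `urlencode` rather than plain string concatenation keeps the URL valid even if a symbol ever contains a character that needs escaping.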

    Now you can wget those URIs and parse out the data directly.
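
If you would rather stay in Python 3 than shell out to wget, the fetch-and-parse step might look roughly like the sketch below. It rests on an assumption the answer does not confirm: that the endpoint returns an HTML fragment containing a table. The names `CellCollector`, `parse_rows`, and `fetch` are hypothetical, and the browser-like `User-Agent` header is a guess at what the server may expect.

```python
from html.parser import HTMLParser
from urllib.request import Request, urlopen


class CellCollector(HTMLParser):
    """Collect the text of every <td>/<th> cell, grouped by table row."""

    def __init__(self):
        super().__init__()
        self.rows = []      # finished rows, each a list of cell strings
        self._row = None    # cells of the row being read, None outside <tr>
        self._cell = None   # text chunks of the current cell, None outside a cell

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row, self._cell = [], None
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row, self._cell = None, None


def parse_rows(html: str) -> list[list[str]]:
    """Return the table rows found in an HTML string as lists of cell text."""
    parser = CellCollector()
    parser.feed(html)
    parser.close()
    return parser.rows


def fetch(url: str, timeout: float = 10.0) -> str:
    """Download one URL and return its body as text (performs a network call)."""
    # Assumption: some servers reject requests without a browser-like User-Agent.
    request = Request(url, headers={"User-Agent": "Mozilla/5.0"})
    with urlopen(request, timeout=timeout) as response:
        charset = response.headers.get_content_charset() or "utf-8"
        return response.read().decode(charset, errors="replace")
```

A call such as `parse_rows(fetch("http://performance.morningstar.com/Performance/cef/trailing-total-returns.action?t=VAW"))` would then give one list of strings per table row, provided the response really is an HTML table. If the endpoint returns JSON instead, `json.loads` on the fetched text would replace `parse_rows`.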
