Executing scraped JavaScript with cheerio

Posted by 两盒软妹~` on 2019-12-04 03:31:22

Question


I have a web page with some JavaScript APIs that don't alter the DOM but return some numbers. I'd like to write a Node.js application that downloads such pages and executes those functions in the context of the downloaded page.

I was looking at cheerio for page scraping, but while I see how easy it is to navigate and manipulate the DOM with it, I don't see any way to run the page's functions. Is it possible to do that?

Should I look, instead, at jsdom?

Thanks


Answer 1:


Sounds like you want to use PhantomJS, which will provide the fully rendered output, and then use cheerio on that.




Answer 2:


Cheerio is an HTML parser and has no notion of executing JavaScript. jsdom can run a page's scripts, but only when script execution is explicitly enabled (the runScripts option in current versions). If the API you wish to access is written in JavaScript, there is little to prevent you from extracting the code and running it inside Node. Beware, though: downloading and executing arbitrary JavaScript can pose a huge security risk. If you want to simulate the behaviour of a browser, look at http://phantomjs.org/. PhantomJS is a standalone headless WebKit browser that is scripted with JavaScript, and it can do most of what an ordinary browser can.
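
As a minimal sketch of the "extract the code and run it inside Node" idea, the snippet below evaluates a script string in a fresh context using Node's built-in vm module. The script text, the function names getTotal and getRate, and the helper runScraped are all hypothetical stand-ins; in practice the source would come from the downloaded page (for example, the text of an inline script element obtained with cheerio). Note that the vm module is not a security boundary, so this does not make untrusted code safe to run.

```javascript
const vm = require('vm');

// Hypothetical stand-in for the text of an inline <script> taken from a
// downloaded page. With cheerio this might come from something like
// cheerio.load(html)('script').first().html() (an assumption, not shown here).
const scrapedSource = `
  function getTotal(a, b) { return a + b; }
  function getRate() { return 0.25; }
`;

// Hypothetical helper: load the scraped source into a fresh context, then
// evaluate one expression against the functions it defined.
function runScraped(source, expression, timeoutMs = 1000) {
  // The new context has its own global object with no require or process.
  // This is isolation of names only, NOT a security sandbox.
  const context = vm.createContext({});
  // Define the page's functions inside the context.
  vm.runInContext(source, context, { timeout: timeoutMs });
  // Call into them; the timeout guards against synchronous infinite loops.
  return vm.runInContext(expression, context, { timeout: timeoutMs });
}

const total = runScraped(scrapedSource, 'getTotal(2, 3)');
console.log(total);
```

This only works when the functions do not depend on browser objects such as window or document. If they do, a browser-like environment (jsdom with script execution enabled, or a headless browser) is the more realistic route.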



Source: https://stackoverflow.com/questions/15025800/executing-scraped-javascript-with-cheerio
