How can I retrieve and parse just the HTML returned from a URL?

Submitted by 眉间皱痕 on 2020-02-25 05:15:25

Question


I want to be able to programmatically (without it displaying in the browser) send a URL such as http://www.amazon.com/s/ref=nb_sb_noss_1?url=search-alias%3Daps&field-keywords=platypi&sprefix=platypi%2Caps&rh=i%3Aaps%2Ck%3Aplatypi and get back in a string (or some more appropriate data type?) the HTML of the resulting page (the interesting part, anyway), so that I can parse it and reformat selected parts as matched text and images (which link to the appropriate page). I want to do this with Razor/Web Pages, if that makes any difference.

IOW, this is sort of a screen-scraping question, but really a "behind-the-screen" scraping.

Is it possible? How? A 100 point post-answer-bonus will be awarded to the (or the most helpful) answer.


Answer 1:


Use the WebClient class (or .NET 4.5's better HttpClient class) to download the HTML, then use the HTML Agility Pack to parse it.
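
A minimal sketch of that approach, assuming the HtmlAgilityPack NuGet package is installed. The class name `PageScraper`, its method names, and the XPath `//a[@href]` are illustrative choices, not something from the question; a real page would need selectors matched to its own markup.

```csharp
using System.Collections.Generic;
using System.Net.Http;
using System.Threading.Tasks;
using HtmlAgilityPack; // third-party NuGet package (assumed to be installed)

// Hypothetical helper class, named for illustration only.
public static class PageScraper
{
    // A single shared HttpClient is reused across requests.
    private static readonly HttpClient Client = new HttpClient();

    // Downloads the raw HTML of the page as a string, without any browser involved.
    public static Task<string> DownloadHtmlAsync(string url)
    {
        return Client.GetStringAsync(url);
    }

    // Parses the HTML and returns (link text, href) pairs for every <a> that has an href.
    public static List<KeyValuePair<string, string>> ExtractLinks(string html)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        var results = new List<KeyValuePair<string, string>>();

        // SelectNodes returns null (not an empty collection) when nothing matches.
        var anchors = doc.DocumentNode.SelectNodes("//a[@href]");
        if (anchors == null)
        {
            return results;
        }

        foreach (var anchor in anchors)
        {
            results.Add(new KeyValuePair<string, string>(
                anchor.InnerText.Trim(),
                anchor.GetAttributeValue("href", "")));
        }
        return results;
    }
}
```

From calling code, something like `var html = await PageScraper.DownloadHtmlAsync(url);` followed by `PageScraper.ExtractLinks(html)` would give a list to loop over and render in a Razor page. Keeping the download and the parsing in separate methods means the parsing can be tried out on a saved HTML string without making a network request.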



Source: https://stackoverflow.com/questions/16470458/how-can-i-retrieve-and-parse-just-the-html-returned-from-an-url
