Grabbing just the URL of an href using HTMLAgilityPack

Posted by 对着背影说爱祢 on 2020-01-06 08:20:24

Question


Here is the HTML source I'm trying to parse:

<a style='white-space: nowrap;' href='/AuthorStories-4931/dreamfall.htm'><img class='donoricon' alt='(Current Donor)'  title='(Current Donor)' src='http://static.tthf.me/images/donors/Current%20Donor.gif'/>dreamfall</a>

Here is the code I'm using:

authorLink = doc.DocumentNode.SelectSingleNode("//a[contains(@href, 'AuthorStories')]").OuterHtml;

This grabs the link correctly, but it also captures the img element nested inside it. The only part I want is the value of the href attribute. Any suggestions on how to extract just that?


Answer 1:


[I haven't touched HtmlAgilityPack in a few years, but this should still be generally true.]

Instead of taking OuterHtml, use the Attributes collection on the node returned by SelectSingleNode; you should be able to read the href value from there.
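
A minimal sketch of that suggestion, assuming a reasonably recent HtmlAgilityPack release. The class and method names (HrefExtractor, GetAuthorHref) are made up for illustration and are not part of the library; the XPath is the one from the question. GetAttributeValue is used here as a convenience over indexing Attributes["href"] directly, since it takes a fallback value for a missing attribute.

```csharp
using System;
using HtmlAgilityPack;

public static class HrefExtractor
{
    // Hypothetical helper: returns the href of the first matching author link,
    // or an empty string when no such link (or no href attribute) exists.
    public static string GetAuthorHref(string html)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        // Same XPath as in the question: the first <a> whose href contains "AuthorStories".
        HtmlNode link = doc.DocumentNode.SelectSingleNode("//a[contains(@href, 'AuthorStories')]");

        // SelectSingleNode returns null when nothing matches, so guard before reading attributes.
        if (link == null)
        {
            return string.Empty;
        }

        // Read only the attribute value instead of the element's OuterHtml.
        // link.Attributes["href"].Value would also work once the attribute is known to exist.
        return link.GetAttributeValue("href", string.Empty);
    }
}
```

Called with the HTML from the question, this should return only the relative path /AuthorStories-4931/dreamfall.htm, without the nested img markup.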



Source: https://stackoverflow.com/questions/12985673/grabbing-just-the-url-of-an-href-using-htmlagilitypack
