Question
I am using the Html Agility Pack to scrape a website that populates its content dynamically with JavaScript.
I was searching for the XPath "//div[@class='PricingInfo']", but that div node is written to the DOM by JavaScript.
So when I load the page through the Html Agility Pack, that XPath finds nothing.
It turns out there is a comment before a particular script block I want to parse.
<!--Module 328 Buying Options Table-->
<script type="text/javascript" language="JavaScript">
var data = {
price: 30.00
}
</script>
The site has many script blocks, so I need to narrow the search by finding the auto-generated comment <!--Module 328 Buying Options Table-->; the sibling node that follows it is the script block I want.
Any idea on how I can search for a particular comment and then just get the adjacent script block?
Thank you!
Answer 1:
htmlDoc.DocumentNode.SelectSingleNode("//comment()[contains(., 'Buying Options')]/following-sibling::script")
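For context, here is a minimal sketch of how that XPath might be used end to end. It assumes the markup from the question wrapped in a hypothetical html/body shell, adds a [1] predicate so that only the nearest following script sibling is taken, and uses an illustrative regular expression to pull the price out of the script text. The regex and the choice of InnerHtml are assumptions for this sample, not part of the original answer.

```csharp
using System;
using System.Globalization;
using System.Text.RegularExpressions;
using HtmlAgilityPack;

class Program
{
    static void Main()
    {
        // Sample markup copied from the question; the html/body wrapper is assumed.
        const string html = @"<html><body>
<!--Module 328 Buying Options Table-->
<script type=""text/javascript"" language=""JavaScript"">
var data = {
price: 30.00
}
</script>
</body></html>";

        var htmlDoc = new HtmlDocument();
        htmlDoc.LoadHtml(html);

        // Find the comment by its text, then take the first script element
        // among its following siblings (whitespace text nodes are skipped).
        HtmlNode script = htmlDoc.DocumentNode.SelectSingleNode(
            "//comment()[contains(., 'Buying Options')]/following-sibling::script[1]");

        if (script == null)
        {
            Console.WriteLine("Script block not found.");
            return;
        }

        // Illustrative pattern for this sample only: a 'price: <number>' pair.
        Match match = Regex.Match(script.InnerHtml, @"price\s*:\s*([0-9]+(?:\.[0-9]+)?)");
        if (match.Success)
        {
            decimal price = decimal.Parse(match.Groups[1].Value, CultureInfo.InvariantCulture);
            Console.WriteLine(price);
        }
    }
}
```

Note that following-sibling matches any later sibling, not only the immediately adjacent node, so if the comment and the script are separated by other script elements the predicate decides which one is returned. SelectSingleNode returns null when nothing matches, hence the check before reading the script text.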
Source: https://stackoverflow.com/questions/3844208/html-agility-pack-find-comment-node