Html Agility Pack get specific content from a <li> tag

Question

I need some text from this page: https://www.amazon.com/dp/B074J9SSPD. Specifically, I need to extract the data under the "About the Product" section.

I tried:

HtmlWeb web = new HtmlWeb();
HtmlDocument doc = new HtmlDocument();
doc = web.Load("https://amazon.com/dp/B074J9SSPD");

foreach(var node in doc.DocumentNode.SelectNodes("//li[@class='showHiddenFeatureBullets']") {
  string ar = node.InnerText;
  HtmlAttribute att = node.Attributes["class"];
  MessageBox.Show(ar.ToString());
  if (att.Value.Contains("showHiddenFeatureBullets")) {

  }
}

Please suggest the right way to do this; I'm getting a blank string.

Answer 1:

Your original code (before that first edit) worked for me; it was just missing the closing parenthesis on the foreach loop. I also pulled the nodes out into their own variable to make the code easier to read, but this should work for you. I tested it locally and it worked.

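using System.Windows.Forms; // MessageBox
using HtmlAgilityPack;

// Download and parse the product page.
HtmlWeb web = new HtmlWeb();
HtmlDocument doc = web.Load("https://amazon.com/dp/B074J9SSPD");

// Select every <li> whose class attribute is exactly "showHiddenFeatureBullets".
var aboutProductNodes = doc.DocumentNode.SelectNodes("//li[@class='showHiddenFeatureBullets']");

foreach (var node in aboutProductNodes)
{
    // InnerText is the bullet's text with the markup stripped.
    string ar = node.InnerText;
    HtmlAttribute att = node.Attributes["class"];
    MessageBox.Show(ar.Trim());
    if (att.Value.Contains("showHiddenFeatureBullets"))
    {
        // Handle the matching bullet here.
    }
}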

However, I would suggest looking into the Amazon API. The code worked about half the time; the other half, Amazon responded with a message asking me to use their API instead of scraping the site. That may have been part of your problem too.

https://developer.amazon.com/services-and-apis
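
When Amazon serves a page without the expected markup, SelectNodes finds nothing and returns null rather than an empty collection, so a bare foreach over the result throws. Here is a minimal sketch of a more defensive version, assuming the bullets are still <li> elements carrying the showHiddenFeatureBullets class. The class name FeatureBulletExtractor and the method Extract are hypothetical names used only for illustration. The XPath uses contains() because @class='...' only matches when the attribute holds exactly that one value.

```csharp
using System;
using System.Collections.Generic;
using HtmlAgilityPack;

// Hypothetical helper, not part of Html Agility Pack.
public static class FeatureBulletExtractor
{
    // Returns the trimmed text of every matching <li>, or an empty list
    // when the page contains none (for example, a "use our API" response).
    public static List<string> Extract(string html)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        // contains() still matches when the <li> carries additional classes.
        var nodes = doc.DocumentNode.SelectNodes(
            "//li[contains(@class, 'showHiddenFeatureBullets')]");

        var bullets = new List<string>();

        // SelectNodes returns null, not an empty collection, when nothing matches.
        if (nodes == null)
        {
            return bullets;
        }

        foreach (var node in nodes)
        {
            // Decode entities such as &amp; and drop surrounding whitespace.
            string text = HtmlEntity.DeEntitize(node.InnerText).Trim();
            if (text.Length > 0)
            {
                bullets.Add(text);
            }
        }

        return bullets;
    }
}
```

The sketch takes an HTML string so it can be tried without a network call; with a live page, the same XPath and null check apply to the document returned by web.Load.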

Source: https://stackoverflow.com/questions/52670970/html-agility-pack-get-specific-content-from-a-li-tag

Tags

html-agility-pack
