Parsing tables and cells with Html Agility Pack in C#

滥情空心 2021-01-07 08:09

I need to parse HTML code. More specifically, I need to parse each cell of every row in all tables. Each row represents a single object, and each cell represents a different property.

2 answers
  •  鱼传尺愫
    2021-01-07 08:30

    What I meant in my comment was that your code (the nested loops) does by hand what the right XPath expression can do for you. Using LINQ to XML makes this even simpler to write. Now that we can see how you want your XML file formatted, we can offer our own answers. I'd write the ParseHtml() method like so:

    // Requires: using System; using System.Linq; using System.Xml.Linq; using HtmlAgilityPack;
    // htmlCode and filepath are assumed to be members of the enclosing class.
    public void ParseHtml()
    {
        var htmlDoc = new HtmlDocument();
        htmlDoc.LoadHtml(htmlCode);
        // use the right XPath rather than looping manually
        var nodes = htmlDoc.DocumentNode
                           .SelectNodes(@"//tr/tr/td[@class='statBox']");
        if (nodes == null)
            return; // SelectNodes returns null when nothing matches
        var cells = nodes.Select(node => node.InnerText.Trim())
                         .ToList();
        var elementNames = new[] { "Name", "Team", "Pos", "GP", "G", "A", "PlusMinus", "PIM", "PP", "SH", "GW", "OT", "Shots", "ShotPctg", "TOIPerGame", "ShiftsPerGame", "FOWinPctg", "UnknownField" };
        var xmlDoc =
            new XElement("Stats", new XAttribute("Date", DateTime.Now.ToShortDateString()),
                new XElement("Player", new XAttribute("Rank", cells.First()),
                    // generate the elements based on the parsed cells,
                    // pairing each cell after the rank with its element name
                    cells.Skip(1)
                         .Zip(elementNames, (value, name) => new XElement(name, value))
                         // drop elements whose cell was empty
                         .Where(element => !String.IsNullOrEmpty(element.Value))
                )
            );

        // save to your file
        xmlDoc.Save(filepath);
    }
    

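    Produces the output:

    <?xml version="1.0" encoding="utf-8"?>
    <Stats Date="...">
      <Player Rank="...">
        <Name>Sidney Crosby</Name>
        <Team>PIT</Team>
        <Pos>C</Pos>
        <GP>39</GP>
        <G>32</G>
        <A>33</A>
        <PlusMinus>20</PlusMinus>
        <PIM>29</PIM>
        <PP>10</PP>
        <SH>1</SH>
        <GW>3</GW>
        <OT>0</OT>
        <Shots>154</Shots>
        <ShotPctg>20.8</ShotPctg>
        <TOIPerGame>21:54</TOIPerGame>
        <ShiftsPerGame>22.6</ShiftsPerGame>
        <FOWinPctg>55.7</FOWinPctg>
      </Player>
    </Stats>
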
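The method above builds a single Player element from all matched cells. If the table has several rows and each row should become its own object, one possible variation is to select the rows first and then zip each row's cells with the element names. The following is only a sketch under assumptions: the class name StatsParser, the method ParseRows, the shortened name list, the simplified XPath, and the sample HTML (including the placeholder second row) are all hypothetical and not taken from the original page, whose markup may need a different XPath.

```csharp
// Requires the HtmlAgilityPack NuGet package.
using System;
using System.Linq;
using System.Xml.Linq;
using HtmlAgilityPack;

public static class StatsParser
{
    // Hypothetical, shortened list of column names, for illustration only.
    private static readonly string[] ElementNames = { "Name", "Team", "Pos" };

    public static XElement ParseRows(string html)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        var stats = new XElement("Stats");

        // Select every row that has at least one statBox cell.
        // SelectNodes returns null (not an empty collection) when nothing matches.
        var rows = doc.DocumentNode.SelectNodes("//tr[td[@class='statBox']]");
        if (rows == null)
            return stats;

        foreach (var row in rows)
        {
            // The XPath here is relative to the current row.
            var cells = row.SelectNodes("td[@class='statBox']")
                           .Select(td => HtmlEntity.DeEntitize(td.InnerText).Trim())
                           .ToList();

            // First cell is the rank; the rest are paired with element names.
            stats.Add(new XElement("Player",
                new XAttribute("Rank", cells.First()),
                cells.Skip(1)
                     .Zip(ElementNames, (value, name) => new XElement(name, value))
                     .Where(element => !string.IsNullOrEmpty(element.Value))));
        }

        return stats;
    }

    public static void Main()
    {
        // Made-up sample markup; the second row is placeholder data.
        const string html = @"<table>
<tr><td class='statBox'>1</td><td class='statBox'>Sidney Crosby</td><td class='statBox'>PIT</td><td class='statBox'>C</td></tr>
<tr><td class='statBox'>2</td><td class='statBox'>Player Two</td><td class='statBox'>AAA</td><td class='statBox'>D</td></tr>
</table>";

        Console.WriteLine(ParseRows(html));
    }
}
```

Because Zip stops at the shorter of the two sequences, a row with fewer cells than names simply produces fewer elements, and extra cells beyond the name list are ignored.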
