character count minus HTML characters C#

广开言路 2021-01-28 18:05

I'm trying to figure out a way to count the number of characters in a string, truncate the string, then return it. However, I need this function to NOT count HTML tags. The pr

1 answer
    挽巷 (original poster)
    2021-01-28 18:51

    Use the right tool for the problem.

    HTML is not a simple format to parse. I would advise that you use a proven, existing parser rather than rolling your own. If you know that you will only ever parse XHTML, you could use an XML parser instead.

    These are the only reliable ways to perform operations on HTML that will preserve the semantic representation.

    Don't try to use regular expressions. HTML is not a regular language and you can only cause yourself grief and misery going in that direction.
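
For the XHTML-only case mentioned above, here is a minimal sketch of counting and truncating text with an XML parser (`System.Xml.Linq`). The class name `XhtmlTruncator`, the method `Truncate`, and the dummy `<root>` wrapper are hypothetical names chosen for illustration, not part of the original answer. It assumes the input is well-formed XML: real-world HTML, or HTML-only entities such as `&nbsp;`, will make `XElement.Parse` throw, and a proper HTML parser would be needed instead.

```csharp
using System;
using System.Linq;
using System.Xml.Linq;

static class XhtmlTruncator
{
    // Hypothetical helper: keeps at most maxChars characters of text content.
    // Element tags are left intact and are never counted.
    public static string Truncate(string xhtmlFragment, int maxChars)
    {
        // Wrap in a dummy root so fragments with several top-level nodes parse.
        var root = XElement.Parse("<root>" + xhtmlFragment + "</root>",
                                  LoadOptions.PreserveWhitespace);
        int remaining = maxChars;

        // Materialize the text nodes first, because the tree is mutated below.
        foreach (var text in root.DescendantNodes().OfType<XText>().ToList())
        {
            if (remaining <= 0)
            {
                text.Remove();                 // budget spent: drop later text
            }
            else if (text.Value.Length > remaining)
            {
                text.Value = text.Value.Substring(0, remaining);
                remaining = 0;
            }
            else
            {
                remaining -= text.Value.Length;
            }
        }

        // Serialize the children only, which strips the dummy root again.
        return string.Concat(
            root.Nodes().Select(n => n.ToString(SaveOptions.DisableFormatting)));
    }
}
```

Calling `XhtmlTruncator.Truncate("<p>Hello <b>world</b>!</p>", 8)` should yield `<p>Hello <b>wo</b></p>`: the eight counted characters are `Hello wo`, and the tags stay balanced because only text nodes are shortened or removed. Elements that come after the cut-off are kept but left empty; whether to prune them is a design choice this sketch does not make.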
