Using C# regular expressions to remove HTML tags

悲&欢浪女 2020-11-22 05:59

How do I use a C# regular expression to remove all HTML tags, including the angle brackets? Can someone please help me with the code?

10 answers
  •  鱼传尺愫
    2020-11-22 06:50

    As has often been said, you should not use regular expressions to process XML or HTML documents. Regular expressions cannot express arbitrarily nested structures in a general way, so they handle such documents unreliably.

    You could use the following.

    String result = Regex.Replace(htmlDocument, @"<[^>]*>", String.Empty);
    

    This works in most cases, but in some cases (for example, a CDATA section containing angle brackets) it will not work as expected.
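
    To make both behaviours concrete, here is a minimal sketch that wraps the same pattern in a small console program. The names `HtmlTagStripper` and `StripTags` are hypothetical and chosen only for illustration; the pattern is the one shown above.

    ```csharp
    using System;
    using System.Text.RegularExpressions;

    // Hypothetical helper class, for illustration only.
    public static class HtmlTagStripper
    {
        // Matches "<", then any run of characters other than ">", then ">".
        private static readonly Regex TagPattern = new Regex(@"<[^>]*>");

        // Removes every match of the pattern from the input string.
        public static string StripTags(string html)
        {
            return TagPattern.Replace(html, String.Empty);
        }

        public static void Main()
        {
            // Simple markup: all tags are removed.
            Console.WriteLine(StripTags("<p>Hello <b>world</b>!</p>"));
            // prints "Hello world!"

            // CDATA containing ">": the match ends at the first ">",
            // so part of the section is left behind.
            Console.WriteLine(StripTags("<![CDATA[a > b]]>"));
            // prints " b]]>"
        }
    }
    ```

    The second call shows the limitation: the pattern has no notion of where a CDATA section ends, so it stops at the first `>` it encounters.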
