Remove only some HTML tags in C#


I have a string:

string hmtl = "<DIV><B> xpto </B></DIV>";

and need to remove the DIV tags from it.


5 answers

情深已故

2021-01-03 09:31
```
// "</?DIV>" matches both the opening <DIV> and the closing </DIV>
html = Regex.Replace(html, @"</?DIV>", String.Empty);
```

慢半拍i

2021-01-03 09:44

Use Regex:

var result = Regex.Replace(html, @"</?DIV>", "");

Update:

As you mentioned, the following code removes every tag except B:

var hmtl = "<DIV><B> xpto </B></DIV>";
var remainTag = "B";
// Matches any tag whose text neither starts nor ends with remainTag
var pattern = String.Format("(</?(?!{0})[^<>]*(?<!{0})>)", remainTag);
var result = Regex.Replace(hmtl, pattern, "");
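The pattern above only checks the characters right after `<` and right before `>`, so with `remainTag = "B"` it would also leave a tag such as `<BR>` in place. A minimal sketch of a stricter variant follows, using a word boundary so that only the exact tag name is kept. The names `TagStripper` and `RemoveAllTagsExcept` are hypothetical, not from the answer, and the sketch assumes simple markup with no `>` inside attribute values.

```csharp
using System;
using System.Text.RegularExpressions;

public static class TagStripper
{
    // Hypothetical helper: removes every tag except the opening and
    // closing tags whose name is exactly keepTag (case-insensitive).
    public static string RemoveAllTagsExcept(string html, string keepTag)
    {
        // <(?!/?NAME\b) : a '<' that is NOT followed by an optional '/'
        //                 and the complete tag name
        // [^<>]*>       : the rest of the tag, up to the closing '>'
        string pattern = "<(?!/?" + Regex.Escape(keepTag) + @"\b)[^<>]*>";
        return Regex.Replace(html, pattern, string.Empty, RegexOptions.IgnoreCase);
    }
}
```

For example, `TagStripper.RemoveAllTagsExcept("<DIV><B> xpto </B></DIV>", "B")` returns `<B> xpto </B>`.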


遇见更好的自我

2021-01-03 09:45

You can use a regular expression:

<\s*/?\s*(body|html)\s*>

In C#:

var result = Regex.Replace(html, @"<\s*/?\s*(body|html)\s*>", "");

This matches tags such as:

<html>
<body>
< / html>
< / body>


伪装坚强ぢ

2021-01-03 09:47

If you are just removing div tags, this will match div tags along with any attributes they may have.

var html =
  "<DIV><B> xpto <div text='abc'/></B></DIV><b>Other text <div>test</div>";

var pattern = @"(</?DIV(.*?)/?>)";

// Replace every match with an empty string; Replace returns a new string
var result = Regex.Replace(html, pattern, string.Empty, RegexOptions.IgnoreCase);

Result:

<B> xpto </B><b>Other text test
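One caveat with the pattern above: `DIV(.*?)` also matches any tag whose name merely starts with "div". The sketch below adds a word boundary after the name and accepts several tag names at once. The names `TagRemover` and `RemoveTags` are hypothetical, and the sketch assumes attribute values contain no `>` character.

```csharp
using System;
using System.Text.RegularExpressions;

public static class TagRemover
{
    // Hypothetical helper: removes opening, closing and self-closing tags
    // (with any attributes) for each of the given tag names.
    public static string RemoveTags(string html, params string[] tagNames)
    {
        // Escape each name so it is treated literally inside the pattern.
        string[] escaped = Array.ConvertAll(tagNames, Regex.Escape);
        // </?(?:a|b)\b : '<', optional '/', one of the names as a whole word
        // [^>]*>       : any attributes, then the closing '>'
        string pattern = "</?(?:" + string.Join("|", escaped) + @")\b[^>]*>";
        return Regex.Replace(html, pattern, string.Empty, RegexOptions.IgnoreCase);
    }
}
```

Calling `TagRemover.RemoveTags(html, "div")` on the sample string above gives the same result as the original pattern, while a tag such as `<divider>` would be left untouched.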


借酒劲吻你

2021-01-03 09:51

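Use HtmlAgilityPack:

HtmlDocument doc = new HtmlDocument();
doc.LoadHtml("<html>yourHtml</html>");

// "//div" is an XPath expression that selects div nodes anywhere in the document.
// SelectNodes returns null when nothing matches, so check before looping.
var divs = doc.DocumentNode.SelectNodes("//div");
if (divs != null)
    foreach (var item in divs)
        Console.WriteLine(item.InnerHtml); // the content of each div

If you want only the B tags:

// HtmlAgilityPack lowercases element names, so the XPath uses "b"
var bolds = doc.DocumentNode.SelectNodes("//b");
if (bolds != null)
    foreach (var item in bolds)
        Console.WriteLine(item.OuterHtml); // the B tag and its content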
