Remove only some HTML tags in C#

猫巷女王i 2021-01-03 09:06

I have a string:

string hmtl = "<DIV><B> xpto </B></DIV>";

and need to remove the DIV tags (<DIV> and </DIV>) from it, keeping everything else.

5 Answers
  • 2021-01-03 09:31
    html = Regex.Replace(html, @"</?DIV>", String.Empty);
    
  • 2021-01-03 09:44

    Use Regex:

    var result = Regex.Replace(html, @"</?DIV>", "");
    

    UPDATED

    As you mentioned, with this code the regex removes every tag except B:

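    var hmtl = "<DIV><B> xpto </B></DIV>";
    var remainTag = "B";
    // (?!B) skips tags that start with the kept name; (?<!B) skips tags that end with it, such as </B>
    var pattern = String.Format("(</?(?!{0})[^<>]*(?<!{0})>)", remainTag);
    var result = Regex.Replace(hmtl, pattern, "");  // "<B> xpto </B>"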
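
The keep-one-tag idea can be extended to a list of tags. The following is only a sketch: the class name TagFilter and the method StripTagsExcept are made up for illustration, and it assumes the input is simple markup with no ">" inside attribute values, comments, or script blocks. It adds a word boundary so that keeping "b" does not also keep "body" or "br", and it matches case-insensitively.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

// Hypothetical helper, not part of any library.
public static class TagFilter
{
    // Removes every tag except those named in keepTags.
    // Text between tags is left untouched.
    public static string StripTagsExcept(string html, params string[] keepTags)
    {
        // Escape each name so it is treated literally inside the pattern.
        string keep = string.Join("|", keepTags.Select(Regex.Escape));

        // "<", an optional "/", then a tag name that is not one of the kept
        // names (\b requires the kept name to end there), then everything
        // up to the closing ">".
        string pattern = keepTags.Length == 0
            ? @"</?[a-z][^<>]*>"
            : $@"</?(?!(?:{keep})\b)[a-z][^<>]*>";

        return Regex.Replace(html, pattern, string.Empty, RegexOptions.IgnoreCase);
    }
}
```

Calling TagFilter.StripTagsExcept("<DIV><B> xpto </B></DIV>", "B") leaves the B element in place and drops the DIV tags. For anything beyond simple, trusted markup, an HTML parser is usually the safer choice.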
    
  • 2021-01-03 09:45

    You can use a regular expression:

    <\s*/?\s*(?:body|html)\s*>
    

    In C#:

    var result = Regex.Replace(html, @"<\s*/?\s*(?:body|html)\s*>", "");

    This matches tags such as:
    
    <html>
    <body>
    < / html> 
    < / body>
    
  • 2021-01-03 09:47

    If you are just removing div tags, this will get div tags as well as any attributes they may have.

    var html =
      "<DIV><B> xpto <div text='abc'/></B></DIV><b>Other text <div>test</div>";
    
    // Match opening, closing, and self-closing div tags, with or without attributes
    var pattern = @"</?DIV\b[^>]*>";
    
    // Replace any match with nothing/empty string
    var result = Regex.Replace(html, pattern, string.Empty, RegexOptions.IgnoreCase);
    

    Result

    <B> xpto </B><b>Other text test
    
  • 2021-01-03 09:51

    Use HtmlAgilityPack:

    HtmlDocument doc = new HtmlDocument();
    doc.LoadHtml("<html>yourHtml</html>");
    
    // "//div" is an XPath expression that selects div nodes anywhere in the HTML.
    // Note: SelectNodes returns null when nothing matches.
    foreach (var item in doc.DocumentNode.SelectNodes("//div"))
    {
        var content = item.InnerHtml; // your div content
    }
    

    If you want only the B tags:

    // Lowercase name here, since HtmlAgilityPack's XPath matching expects lowercase element names
    foreach (var item in doc.DocumentNode.SelectNodes("//b"))
    {
        var bTag = item.OuterHtml; // your B tag and its content
    }
    