Regular expression to replace square brackets with angle brackets

Happy的楠姐 2021-01-28 19:01

I have a string like:

[a b=\"c\" d=\"e\"]Some multi line text[/a]

Now the part d=\"e\" is optional. I want to convert such type of

3 Answers
  • 2021-01-28 19:10

    If you are actually thinking of processing (pseudo)-HTML using regexes,

    don't

    SO is filled with posts where regexes are proposed for HTML/XML and answers pointing out why this is a bad idea.

    Suppose your multiline text ("which can be anything") contains

    [a b="foo" [a b="bar"]]
    

    a regex cannot detect this.

    See the classic answer in: RegEx match open tags except XHTML self-contained tags

    which has:

    I think it's time for me to quit the post of Assistant Don't Parse HTML With Regex Officer. No matter how many times we say it, they won't stop coming every day... every hour even. It is a lost cause, which someone else can fight for a bit. So go on, parse HTML with regex, if you must. It's only broken code, not life and death. – bobince

    Seriously. Find an XML or HTML DOM and populate it with your data. Then serialize it. That will take care of all the problems you don't even know you have got.
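
    As a rough sketch of the "populate a DOM, then serialize" idea, the snippet below uses LINQ to XML (System.Xml.Linq). That library choice is an assumption, since the answer does not name one, and the DomBuilder.BuildAnchor helper is hypothetical. It also assumes the attribute values and body text have already been pulled out of the bracketed input by some other means.

    ```csharp
    using System.Xml.Linq;

    static class DomBuilder
    {
        // Hypothetical helper: builds <a b="..." [d="..."]>text</a> as an XElement
        // and serializes it. Passing null for d leaves the attribute out, because
        // XElement ignores null content items.
        public static string BuildAnchor(string b, string d, string text)
        {
            var element = new XElement("a",
                new XAttribute("b", b),
                d == null ? null : new XAttribute("d", d),
                text);

            // ToString() escapes characters such as '<' and '&' in the text and
            // attribute values, which a plain string replacement would not do.
            return element.ToString();
        }
    }
    ```

    For example, DomBuilder.BuildAnchor("c", "e", "Some multi line text") returns <a b="c" d="e">Some multi line text</a>.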

  • 2021-01-28 19:12

    Can the multi-line text itself contain [ or ]? If not, you can simply replace every [ with < and every ] with > using string.replace; no regex is needed.
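
    A minimal C# sketch of that plain replacement, assuming the body text never contains square brackets of its own (the BracketSwap.ToAngle helper name is made up for illustration):

    ```csharp
    using System;

    static class BracketSwap
    {
        // Hypothetical helper: swaps every '[' for '<' and every ']' for '>'.
        // Only safe under the assumption that brackets appear solely as tag delimiters.
        public static string ToAngle(string input) =>
            input.Replace('[', '<').Replace(']', '>');
    }
    ```

    Calling BracketSwap.ToAngle on [a b="c" d="e"]Some multi line text[/a] gives <a b="c" d="e">Some multi line text</a>.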

    Update: If the text can be anything except [/a], you can replace

    ^\[a([^\]]+)](.*?)\[/a]$
    

    with

    <a$1>$2</a>
    

    I haven't escaped ] and / in the regex above. If your regex flavor requires it, escape them to get

    ^\[a([^\]]+)\](.*?)\[\/a\]$
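
    Here is one way this pattern might be applied in C#. This is a sketch, not part of the answer: the TagRewriter class name is hypothetical, and RegexOptions.Singleline is added on the assumption that the body really spans several lines, since without it '.' does not match a newline in .NET.

    ```csharp
    using System.Text.RegularExpressions;

    static class TagRewriter
    {
        // Pattern taken from the answer. Singleline makes '.' match newlines too,
        // so the text between [a ...] and [/a] may contain line breaks.
        private static readonly Regex Tag = new Regex(
            @"^\[a([^\]]+)\](.*?)\[/a\]$", RegexOptions.Singleline);

        // $1 is the attribute list, $2 is the body text.
        public static string Rewrite(string input) =>
            Tag.Replace(input, "<a$1>$2</a>");
    }
    ```

    Note that [^\]]+ stops at the first ], so an attribute value that itself contains ] would not be handled by this pattern.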
    
  • 2021-01-28 19:20

    For real HTML tags, please use an HTML parser.

    For [a][/a], you can do the following:

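    // Requires: using System.Text.RegularExpressions;
    // RegexOptions.Singleline lets '.' match newlines, so the body can span several lines.
    // (RegexOptions.Multiline only changes how ^ and $ behave.)
    Match m = Regex.Match(@"[a b=""c"" d=""e""]Some multi line text[/a]",
                          @"\[a b=""([^""]+)"" d=""([^""]+)""\](.*?)\[/a\]",
                          RegexOptions.Singleline);

    m.Groups[1].Value   // "c"
    m.Groups[2].Value   // "e"
    m.Groups[3].Value   // "Some multi line text"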
    

    Here is a Regex.Replace version (not my preferred approach, though):

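    // The d attribute is optional: ( d="...")? leaves $2 empty when it is absent.
    // Singleline lets '.' match newlines in the body; Multiline would not.
    string inputStr = @"[a b=""[[[[c]]]]"" d=""e[]""]Some multi line text[/a]";
    string resultStr = Regex.Replace(inputStr,
                                     @"\[a( b=""[^""]+"")( d=""[^""]+"")?\](.*?)\[/a\]",
                                     @"<a$1$2>$3</a>",
                                     RegexOptions.Singleline);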
    