Regex to split BBCode into pieces

你的背包 2020-11-28 14:24

I have this:

str = "some html code [img]......[/img] some html code [img]......[/img]"

and I want to get this:

["[img]..
4 Answers
  • 2020-11-28 14:36

    Please don't use BBCode. It's evil.

    BBCode came to life when developers were too lazy to parse HTML correctly and decided to invent their own markup language. As with all products of laziness, the result is completely inconsistent, unstandardized, and widely adopted.

    Try a more user-friendly markup language instead, such as Markdown (that's what Stack Overflow uses) or Textile. Both have parsers for Ruby:

    • Maruku for Markdown
    • RedCloth for Textile

    If you still don't want to heed my advice and choose to go with BBCode, don't reinvent the wheel: use a BBCode parser. To answer your question directly, there is the least desirable option, which is a regex:

    /\[img\].*?\[\/img\]/
    

    As seen on Rubular. That said, I would use /\[img\](.*?)\[\/img\]/ so that it extracts the contents inside the img tags. Note that this is fragile and breaks as soon as img tags are nested, hence the advice to use a parser.
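
    To make the difference between the two patterns concrete, here is a minimal sketch using String#scan on the sample string from the question. The variable names are only illustrative.

```ruby
# Sample input taken from the question.
str = "some html code [img]......[/img] some html code [img]......[/img]"

# Without a capture group, scan returns each full match, tags included.
whole_tags = str.scan(/\[img\].*?\[\/img\]/)

# With one capture group, scan returns an array of one-element arrays,
# so flatten it to get a plain list of the tag contents.
contents = str.scan(/\[img\](.*?)\[\/img\]/).flatten

p whole_tags  # ["[img]......[/img]", "[img]......[/img]"]
p contents    # ["......", "......"]
```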

  • 2020-11-28 14:42
    irb(main):001:0> str = "some html code [img]......[/img] some html \
    code [img]......[/img]"
    "some html code [img]......[/img] some html code [img]......[/img]"
    irb(main):002:0> str.scan(/\[img\].*?\[\/img\]/)
    ["[img]......[/img]", "[img]......[/img]"]
    

    Keep in mind that this answer is tailored to the exact string in your question. Change str by, say, nesting one image tag inside another, and all hell will break loose.
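
    As a rough illustration of that failure mode, assume a hypothetical input where one img tag is nested inside another. The lazy match stops at the first closing tag it finds, so the outer tag is cut short:

```ruby
# Hypothetical nested input, not taken from the question.
nested = "[img]a[img]b[/img]c[/img]"

# The lazy quantifier ends the match at the first [/img], which belongs
# to the inner tag, so the outer tag is never matched as a whole.
matches = nested.scan(/\[img\].*?\[\/img\]/)

p matches  # ["[img]a[img]b[/img]"]
```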

  • 2020-11-28 14:48
    str = "some html code [img]......[/img] some html code [img]......[/img]"
    # Split on the closing tag, then strip everything up to and including each opening tag.
    p str.split("[/img]").map { |x| x.sub(/.*\[img\]/, "") }
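
    One caveat with the split approach, sketched below on a hypothetical input: any text after the last closing tag comes back as an extra element, because it contains no [img] to strip. A scan with a capture group does not have that problem.

```ruby
# Hypothetical input with text after the final closing tag.
str = "intro [img]a.png[/img] middle [img]b.png[/img] trailing text"

# The split approach keeps the trailing chunk unchanged.
via_split = str.split("[/img]").map { |chunk| chunk.sub(/.*\[img\]/, "") }

# scan only returns text that sits between a matching pair of tags.
via_scan = str.scan(/\[img\](.*?)\[\/img\]/).flatten

p via_split  # ["a.png", "b.png", " trailing text"]
p via_scan   # ["a.png", "b.png"]
```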
    
  • 2020-11-28 14:51

    There is a Ruby BBCode parser on Google Code.

    Don't use regex for this.
