Regex help required

天命终不由人 2021-01-24 02:38

I am trying to replace two or more occurrences of <br/> tags (like <br/> <br/> <br/>) together with two <br/><br/> tags.

3 answers
  • 2021-01-24 03:05

    You can do that by changing your regex slightly:

    Pattern brTagPattern = Pattern.compile("<\\s*br\\s*/\\s*>\\s*<\\s*br\\s*/\\s*>", Pattern.CASE_INSENSITIVE | Pattern.DOTALL);
    

    This ignores any whitespace between the two <br/> tags. If you want to allow exactly two or three whitespace characters between them, you can use:

    Pattern brTagPattern = Pattern.compile("<\\s*br\\s*/\\s*>(\\s){2,3}<\\s*br\\s*/\\s*>", Pattern.CASE_INSENSITIVE | Pattern.DOTALL);
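
For reference, here is a minimal Java sketch of how the two patterns above could be applied with `Matcher.replaceAll`. The class name `BrPatterns`, the constants `ANY_GAP` and `NARROW_GAP`, and the helper `collapsePairs` are hypothetical names chosen for illustration; only the regex strings and flags come from the answer.

```java
import java.util.regex.Pattern;

final class BrPatterns {
    // First pattern from the answer: two <br/> tags separated by any amount of whitespace.
    static final Pattern ANY_GAP = Pattern.compile(
            "<\\s*br\\s*/\\s*>\\s*<\\s*br\\s*/\\s*>",
            Pattern.CASE_INSENSITIVE | Pattern.DOTALL);

    // Second pattern: the gap must be exactly two or three whitespace characters.
    static final Pattern NARROW_GAP = Pattern.compile(
            "<\\s*br\\s*/\\s*>(\\s){2,3}<\\s*br\\s*/\\s*>",
            Pattern.CASE_INSENSITIVE | Pattern.DOTALL);

    // Hypothetical helper: rewrite every matched pair of tags as "<br/><br/>".
    static String collapsePairs(Pattern pattern, String html) {
        return pattern.matcher(html).replaceAll("<br/><br/>");
    }

    public static void main(String[] args) {
        System.out.println(collapsePairs(ANY_GAP, "a<br/> \n <BR />b")); // a<br/><br/>b
        System.out.println(NARROW_GAP.matcher("<br/> <br/>").find());    // false (one space)
        System.out.println(NARROW_GAP.matcher("<br/>  <br/>").find());   // true (two spaces)
    }
}
```

Note that each match covers exactly two tags, so an input such as `<br/><br/><br/>` comes back unchanged: the first two tags are rewritten to themselves and the third is left in place. Collapsing longer runs needs a repeated group.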
    
  • 2021-01-24 03:08

    Here's some Groovy code to test your Pattern:

    import java.util.regex.*
    
    Pattern brTagPattern = Pattern.compile( "(<\\s*br\\s*/\\s*>\\s*){2,}", Pattern.CASE_INSENSITIVE | Pattern.DOTALL )
    def testData = [
      ['',                            ''],
      ['<br/>',                       '<br/>'],
      ['< br/> <br />',               '<br/><br/>'],
      ['<br/> <br/><br/>',            '<br/><br/>'],
      ['<br/>   < br/ > <br/>',       '<br/><br/>'],
      ['<br/> <br/>   <br/>',         '<br/><br/>'],
      ['<br/><br/><br/> <br/><br/>',  '<br/><br/>'],
      ['<br/><br/><br/><b>w</b><br/>','<br/><br/><b>w</b><br/>'],
    ]
    
    testData.each { inputStr, expected ->
      Matcher matcher = brTagPattern.matcher( inputStr )
      assert expected == matcher.replaceAll( '<br/><br/>' )
    }
    

    All of these assertions pass.

  • 2021-01-24 03:19

    Probably not the answer you want to hear, but it is general wisdom that you should not attempt to parse XML/HTML with regular expressions. So many things can go wrong -- it's a much better idea to use a parsing library specifically meant for such data, which will also completely bypass the issue you're having.

    Take a look at JAXB if you are certain your HTML is well-formed XML. If the HTML is likely to be messy and non-compliant (like most real-world HTML), try something like TagSoup.
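
As a rough illustration of the parser-based approach, the sketch below uses the JDK's built-in DOM parser rather than JAXB or TagSoup, and so assumes the input is well-formed XML. The class `DomBrCollapser` and its methods are hypothetical names, not part of any library. It trims every run of sibling `<br/>` elements (ignoring whitespace-only text between them) down to at most two.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

final class DomBrCollapser {
    // Parses well-formed XML and trims each run of sibling <br/> elements to two.
    static Document collapse(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            collapseRuns(doc.getDocumentElement());
            return doc;
        } catch (Exception e) {
            // Assumption: callers treat unparseable input as a bad argument.
            throw new IllegalArgumentException("Input is not well-formed XML", e);
        }
    }

    private static void collapseRuns(Node parent) {
        int run = 0; // number of consecutive <br/> siblings seen so far
        Node child = parent.getFirstChild();
        while (child != null) {
            Node next = child.getNextSibling(); // capture before a possible removal
            if (isBr(child)) {
                run++;
                if (run > 2) {
                    parent.removeChild(child);
                }
            } else if (!isBlankText(child)) {
                run = 0;             // any other content ends the run
                collapseRuns(child); // recurse into nested elements
            }
            child = next;
        }
    }

    private static boolean isBr(Node n) {
        return n.getNodeType() == Node.ELEMENT_NODE
                && "br".equalsIgnoreCase(n.getNodeName());
    }

    private static boolean isBlankText(Node n) {
        return n.getNodeType() == Node.TEXT_NODE
                && n.getTextContent().trim().isEmpty();
    }

    // Counts lowercase <br> elements remaining in the document.
    static int countBr(Document doc) {
        return doc.getElementsByTagName("br").getLength();
    }
}
```

For example, `DomBrCollapser.collapse("<p><br/> <br/><br/>text<br/></p>")` leaves three `<br/>` elements: two from the leading run of three, plus the single one after the text. Real-world HTML that is not valid XML would fail to parse here, which is where a lenient parser such as TagSoup comes in.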
