Java string - get everything between (but not including) two regular expressions?

独厮守ぢ 2020-12-13 11:24

In Java, is there a simple way to extract a substring by specifying regular-expression delimiters on either side, without including the delimiters in the resulting substring?

2 answers
  • 2020-12-13 11:38

    Write a regex like this:

    "(regex1)(.*)(regex2)"
    

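    ... and pull the middle group out of the matcher (group 2 in the pattern above). If the text between the delimiters can span several lines, compile the pattern with Pattern.DOTALL so that the dot also matches line terminators.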

    Using your example we can write a program like:

    package test;
    
    import java.util.regex.Matcher;
    import java.util.regex.Pattern;
    
    public class Regex {
    
        public static void main(String[] args) {
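            // Only the text between the delimiters is captured, so it is group 1.
            // DOTALL lets the dot match the newlines inside the column text.
            Pattern p = Pattern.compile(
                    "<row><column>(.*)</column></row>",
                    Pattern.DOTALL
            );
    
            Matcher matcher = p.matcher(
                    "<row><column>Header\n\n\ntext</column></row>"
            );
    
            // matches() succeeds only if the whole input fits the pattern.
            if (matcher.matches()) {
                System.out.println(matcher.group(1));
            }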
        }
    
    }
    

    When run, this prints:

    Header
    
    
    text
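
The same idea can be wrapped in a small reusable helper. The following is only a sketch under a few assumptions: the class name BetweenDelimiters, the method between, and the group name body are all made up for illustration. It differs from the program above in two ways: it uses find() so the delimiters may appear anywhere in the input, and a reluctant quantifier so the match stops at the nearest end delimiter rather than the last one.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class BetweenDelimiters {

    // Hypothetical helper: returns the text between the first match of
    // startRegex and the nearest following match of endRegex, if any.
    public static Optional<String> between(String input, String startRegex, String endRegex) {
        // The delimiters are wrapped in non-capturing groups so that an
        // alternation inside them cannot leak into the rest of the pattern.
        // The wanted text is captured in a named group ("body", an arbitrary
        // name), so capturing groups inside the delimiters do not shift it.
        // The reluctant .*? stops at the first possible end delimiter.
        Pattern p = Pattern.compile(
                "(?:" + startRegex + ")(?<body>.*?)(?:" + endRegex + ")",
                Pattern.DOTALL);
        Matcher m = p.matcher(input);
        // find() searches anywhere in the input, whereas matches() would
        // require the whole input to fit the pattern.
        return m.find() ? Optional.of(m.group("body")) : Optional.empty();
    }

    public static void main(String[] args) {
        String input = "log: <row><column>Header\ntext</column></row> trailing";
        System.out.println(between(input, "<column>", "</column>").orElse("(no match)"));
    }
}
```

If the delimiters are literal strings rather than regular expressions, passing them through Pattern.quote first avoids surprises with characters such as [ or *.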
    
  • 2020-12-13 11:55

    You should not use regular expressions to decode XML - this will eventually break if the input is not strictly controlled.

    The easiest approach is probably to parse the XML into a DOM tree (Java 1.4 and newer include an XML parser in the standard library) and then navigate the tree to pick out what you need.

    Perhaps you could tell us what you want to accomplish with your program?
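
For comparison, the DOM approach described above could look roughly like this. It is a minimal sketch, assuming the input is a well-formed XML string and that the wanted text sits in the first column element; the class name DomColumnText and the method firstColumnText are hypothetical.

```java
import java.io.StringReader;
import java.util.Optional;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class DomColumnText {

    // Hypothetical helper: parses the XML string into a DOM tree and returns
    // the text content of the first <column> element, if there is one.
    public static Optional<String> firstColumnText(String xml) {
        try {
            DocumentBuilder builder =
                    DocumentBuilderFactory.newInstance().newDocumentBuilder();
            Document doc = builder.parse(new InputSource(new StringReader(xml)));
            NodeList columns = doc.getElementsByTagName("column");
            if (columns.getLength() == 0) {
                return Optional.empty();
            }
            // getTextContent() returns the concatenated text of the element
            // and its descendants, newlines included.
            return Optional.of(columns.item(0).getTextContent());
        } catch (Exception e) {
            // Malformed XML or parser configuration problems end up here.
            throw new IllegalArgumentException("Could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        String xml = "<row><column>Header\n\n\ntext</column></row>";
        System.out.println(firstColumnText(xml).orElse("(no column element)"));
    }
}
```

Unlike the regex, this keeps working when the elements carry attributes or the content contains entities such as &amp;amp;, which the parser decodes for you.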
