Saving substrings using Regular Expressions

Submitted by 北城余情 on 2019-12-02 01:09:30

Yes. Wrap the part of the regular expression that matches the interesting word in a "capturing group", which is just a pair of parentheses ( ) around it.

Here is an example:

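import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class WidgetExample {
    public static void main(String[] args) {
        // The parentheses around \d+ form capturing group 1.
        Pattern pat = Pattern.compile("testing (\\d+) widgets");
        String text = "testing 5 widgets";
        Matcher matcher = pat.matcher(text);

        // matches() succeeds only if the entire input matches the pattern.
        if (matcher.matches()) {
            System.out.println("Widgets tested : " + matcher.group(1));
        } else {
            System.out.println("No match");
        }
    }
}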

Pattern and Matcher come from java.util.regex. The String class offers some shortcuts, but these two classes are the most flexible.
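One detail of the example worth spelling out: matches() only succeeds when the whole input matches the pattern. To save a substring from a longer text, find() is the usual choice. The following is a minimal sketch; the class name WidgetCount and its method names are made up for illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class WidgetCount {
    private static final Pattern PAT = Pattern.compile("testing (\\d+) widgets");

    // Returns the digits captured by group 1, or null when the
    // pattern does not occur anywhere in the text.
    static String extract(String text) {
        Matcher matcher = PAT.matcher(text);
        return matcher.find() ? matcher.group(1) : null;
    }

    // True only when the entire text matches the pattern.
    static boolean matchesWhole(String text) {
        return PAT.matcher(text).matches();
    }

    public static void main(String[] args) {
        String text = "Today we are testing 5 widgets again";
        System.out.println(matchesWhole(text)); // false: extra text around the match
        System.out.println(extract(text));      // 5
    }
}
```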

The problem specification isn't very clear, but here are some ideas that may work:

Use lookarounds and replaceAll/First

The following regex matches the \w+ that is preceded by the string "{item " and followed by the string " [". The lookarounds assert the surrounding text without consuming it, so the match is exactly the \w+ and nothing else. The metacharacters { and [ are escaped as necessary.

String text =
    "Person item6 [can {item thing [wrap]}]\n" +
    "Cat item7 [meow meow {item thang [purr]}]\n" +
    "Dog item8 [maybe perhaps {itemmmm thong [woof]}]" ;

String LOOKAROUND_REGEX = "(?<=\\{item )\\w+(?= \\[)";

System.out.println(
    text.replaceAll(LOOKAROUND_REGEX, "STUFF")
);

This prints:

Person item6 [can {item STUFF [wrap]}]
Cat item7 [meow meow {item STUFF [purr]}]
Dog item8 [maybe perhaps {itemmmm thong [woof]}]
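Since the lookarounds keep the match down to the word itself, the same regex can also be used to save the words rather than replace them. A minimal sketch, assuming the same sample text; the class name LookaroundExtract and the method words are made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class LookaroundExtract {
    static final Pattern LOOKAROUND =
        Pattern.compile("(?<=\\{item )\\w+(?= \\[)");

    // Collects every word that sits between "{item " and " [".
    static List<String> words(String text) {
        List<String> found = new ArrayList<>();
        Matcher m = LOOKAROUND.matcher(text);
        while (m.find()) {
            // group() is the whole match, which here is only the word,
            // because lookarounds do not consume any characters.
            found.add(m.group());
        }
        return found;
    }

    public static void main(String[] args) {
        String text =
            "Person item6 [can {item thing [wrap]}]\n" +
            "Cat item7 [meow meow {item thang [purr]}]\n" +
            "Dog item8 [maybe perhaps {itemmmm thong [woof]}]";
        System.out.println(words(text)); // [thing, thang]
    }
}
```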

Use capturing groups instead of lookarounds

Lookarounds should be used judiciously. Lookbehinds in particular are very limited in Java. A more commonly applied technique is to match more than just the interesting part and use capturing groups to separate the pieces.

The following regex matches the same \w+ as before, but the match also includes the "{item " prefix and the " [" suffix. Additionally, the m in item can now repeat any number of times (unbounded repetition that Java's lookbehind does not officially support).

String CAPTURING_REGEX = "(\\{item+ )(\\w+)( \\[)";

System.out.println(
    text.replaceAll(CAPTURING_REGEX, "$1STUFF$3")
);

This prints:

Person item6 [can {item STUFF [wrap]}]
Cat item7 [meow meow {item STUFF [purr]}]
Dog item8 [maybe perhaps {itemmmm STUFF [woof]}]

Our pattern has 3 capturing groups:

(\{item+ )(\w+)( \[)
\________/\___/\___/
 group 1    2    3

Note that we can't simply replace what we matched with "STUFF", because we match some "extraneous" parts. We're not interested in replacing them, so we capture these parts and just put them back in the replacement string. The way we refer to what a group captured in replacement strings in Java is to use the $ sigil; thus the $1 and $3 in the above example.
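Numbered references like $1 and $3 can get hard to read as a pattern grows. Java 7 and later also accept named groups, referenced as ${name} in the replacement string. The sketch below is one possible variation on the regex above; the group names prefix, word and suffix, the class name NamedGroups, and the method replaceWord are all made up for illustration.

```java
import java.util.regex.Matcher;

public class NamedGroups {
    // Same shape as the capturing regex above, with each group named.
    static final String NAMED_REGEX =
        "(?<prefix>\\{item+ )(?<word>\\w+)(?<suffix> \\[)";

    // Replaces only the word and puts the prefix and suffix back.
    static String replaceWord(String text, String replacement) {
        // quoteReplacement keeps any $ or \ in the new word literal.
        return text.replaceAll(NAMED_REGEX,
            "${prefix}" + Matcher.quoteReplacement(replacement) + "${suffix}");
    }

    public static void main(String[] args) {
        String text = "Dog item8 [maybe perhaps {itemmmm thong [woof]}]";
        System.out.println(replaceWord(text, "STUFF"));
    }
}
```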

Use a Matcher for more flexibility

Not everything can be done with replacement strings. Java's replacement syntax has no postprocessing to capitalize a captured string, for example. In these more general replacement scenarios, you can use a Matcher loop like the following:

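// Assumes text and CAPTURING_REGEX from the previous examples,
// plus imports of java.util.regex.Matcher and java.util.regex.Pattern.
Matcher m = Pattern.compile(CAPTURING_REGEX).matcher(text);
StringBuffer sb = new StringBuffer();
while (m.find()) {
    System.out.println("Match found");
    // Group 0 is the whole match; groups 1..groupCount() are the captures.
    for (int i = 0; i <= m.groupCount(); i++) {
        System.out.printf("Group %d captured <%s>%n", i, m.group(i));
    }
    System.out.println(); // blank line between matches
    // %<s and %<S reuse the previous argument (group 2); %S uppercases it.
    // Note: appendReplacement treats $ and \ in the replacement specially.
    m.appendReplacement(sb,
        String.format("%s%s %<s and more %<SS%s",
            m.group(1), m.group(2), m.group(3)
        )
    );
}
m.appendTail(sb); // copy the text after the last match

System.out.println(sb.toString());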

The above prints:

Match found
Group 0 captured <{item thing [>
Group 1 captured <{item >
Group 2 captured <thing>
Group 3 captured < [>

Match found
Group 0 captured <{item thang [>
Group 1 captured <{item >
Group 2 captured <thang>
Group 3 captured < [>

Match found
Group 0 captured <{itemmmm thong [>
Group 1 captured <{itemmmm >
Group 2 captured <thong>
Group 3 captured < [>

Person item6 [can {item thing thing and more THINGS [wrap]}]
Cat item7 [meow meow {item thang thang and more THANGS [purr]}]
Dog item8 [maybe perhaps {itemmmm thong thong and more THONGS [woof]}]
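On newer JDKs the same kind of computed replacement can be written more compactly. The sketch below assumes Java 9 or later, where Matcher.replaceAll accepts a function from MatchResult to a replacement string; it is an alternative to the loop above, not part of the original answer, and the class name LambdaReplace and method upperCaseWords are made up for illustration.

```java
import java.util.Locale;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class LambdaReplace {
    static final Pattern CAPTURING =
        Pattern.compile("(\\{item+ )(\\w+)( \\[)");

    // Uppercases the captured word and leaves prefix and suffix untouched.
    static String upperCaseWords(String text) {
        return CAPTURING.matcher(text).replaceAll(mr ->
            // The returned string is still parsed as a replacement string,
            // so quoteReplacement keeps any $ or \ literal.
            Matcher.quoteReplacement(
                mr.group(1) + mr.group(2).toUpperCase(Locale.ROOT) + mr.group(3)));
    }

    public static void main(String[] args) {
        System.out.println(
            upperCaseWords("Cat item7 [meow meow {item thang [purr]}]"));
    }
}
```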
