Java Regex inconsistent groups

不羁的心 submitted on 2019-12-11 23:37:19

Question


Please refer to the following question on SO:

Java: Regex not matching

My Regex groups are not consistent. My code looks like:

import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class RegexTest {

    public static void main(String[] args) {

        // final String VALUES_REGEX = "^\\{([0-9a-zA-Z\\-\\_\\.]+)(?:,\\s*([0-9a-zA-Z\\-\\_\\.]*))*\\}$";
        final String VALUES_REGEX = "\\{([\\w.-]+)(?:, *([\\w.-]+))*\\}";

        final Pattern REGEX_PATTERN = Pattern.compile(VALUES_REGEX);
        final String values = "{df1_apx.fhh.irtrs.d.rrr, ffd1-afp.farr.d.rrr.asgd, ffd2-afp.farr.d.rrr.asgd}";
        final Matcher matcher = REGEX_PATTERN.matcher(values);
        if (null != values && matcher.matches()) {
            // for (int index=1; index<=matcher.groupCount(); ++index) {
            // System.out.println(matcher.group(index));
            // }

            while (matcher.find()) {
                System.out.println(matcher.group());
            }
        }

    }
}

I tried the following combinations:

A) Regex "^\{([0-9a-zA-Z\-\_\.]+)(?:,\s*([0-9a-zA-Z\-\_\.]*))*\}$", iterating with groupCount(). Result:

df1_apx.fhh.irtrs.d.rrr

ffd2-afp.farr.d.rrr.asgd

B) Regex "^\{([0-9a-zA-Z\-\_\.]+)(?:,\s*([0-9a-zA-Z\-\_\.]*))*\}$", iterating with matcher.find(). Result: no output.

C) Regex "\{([\w.-]+)(?:, *([\w.-]+))*\}", iterating with groupCount(). Result:

df1_apx.fhh.irtrs.d.rrr

ffd2-afp.farr.d.rrr.asgd

D) Regex "\{([\w.-]+)(?:, *([\w.-]+))*\}", iterating with matcher.find(). Result: no output.

I never get consistent groups. The expected result here is:

df1_apx.fhh.irtrs.d.rrr

ffd1-afp.farr.d.rrr.asgd

ffd2-afp.farr.d.rrr.asgd

Please let me know how I can achieve this.


Answer 1:


(?<=[{,])\s*(.*?)(?=,|})

You can simply use this and grab the captures. See the demo:

https://regex101.com/r/sJ9gM7/33

When a capture group is repeated, as in (...)*, the regex engine remembers only the last capture. You won't get all the groups this way.
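As a rough illustration, the lookaround pattern from this answer can be driven from Java with find(), reading capture group 1 on each iteration. The class name LookaroundExtractDemo and the helper extractItems are hypothetical names chosen for this sketch; only Pattern and Matcher come from java.util.regex.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class LookaroundExtractDemo {

    // The pattern from this answer, written as a Java string literal.
    private static final Pattern ITEM =
            Pattern.compile("(?<=[{,])\\s*(.*?)(?=,|})");

    // Hypothetical helper: collects capture group 1 of every successive find() match.
    static List<String> extractItems(String input) {
        List<String> items = new ArrayList<>();
        Matcher matcher = ITEM.matcher(input);
        while (matcher.find()) {
            items.add(matcher.group(1));
        }
        return items;
    }

    public static void main(String[] args) {
        String values = "{df1_apx.fhh.irtrs.d.rrr, ffd1-afp.farr.d.rrr.asgd, ffd2-afp.farr.d.rrr.asgd}";
        for (String item : extractItems(values)) {
            System.out.println(item);
        }
    }
}
```

Note that this sketch only extracts items; it does not check that the whole string has the `{a, b, c}` shape.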




Answer 2:


The problem is that you are trying to do two things at the same time:

  • you want to validate the string format
  • you want to extract each item (with an unknown number of items)

This is not possible with the matches method alone: when the same capture group is repeated, each earlier capture is overwritten by the next one, so only the last survives.
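A small sketch of that overwriting, using the second pattern from the question. The class and method names are hypothetical. It also shows why combinations B and D print nothing: once matches() has consumed the entire input, a following find() on the same matcher resumes from the end and finds no further match.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class RepeatedGroupDemo {

    // Pattern from the question: group 2 sits inside a repeated non-capturing group.
    static final Pattern VALUES =
            Pattern.compile("\\{([\\w.-]+)(?:, *([\\w.-]+))*\\}");

    // Returns what group 2 holds after a successful matches(), or null otherwise.
    static String lastRepeatedCapture(String input) {
        Matcher matcher = VALUES.matcher(input);
        return matcher.matches() ? matcher.group(2) : null;
    }

    // True only if find() still finds a match right after a successful matches().
    static boolean findAfterMatches(String input) {
        Matcher matcher = VALUES.matcher(input);
        return matcher.matches() && matcher.find();
    }

    public static void main(String[] args) {
        String values = "{df1_apx.fhh.irtrs.d.rrr, ffd1-afp.farr.d.rrr.asgd, ffd2-afp.farr.d.rrr.asgd}";
        // Group 2 keeps only the final repetition; the middle item is lost.
        System.out.println(lastRepeatedCapture(values));
        // No match remains for find() once matches() has consumed the input.
        System.out.println(findAfterMatches(values));
    }
}
```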

One possible way is to use the find method to obtain each item and the contiguity anchor \G to check the format. \G ensures that the current match immediately follows the previous match or starts at the beginning of the string:

(?:\\G(?!\\A),\\s*|\\A\\{)([\\w.-]+)(}\\z)?

pattern details:

(?:                  # two possible begins:
    \\G(?!\\A),\\s*  # contiguous to a previous match
                     # (but not at the start of the string)
  |                  # OR
    \\A\\{           # the start of the string
)
([\\w.-]+)           # an item in the capture group 1
(}\\z)?              # the optional capture group 2 to check
                     # that the end of the string has been reached

So to check the format of the string from start to end, all you need to do is test whether capture group 2 participated in the last match.
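Putting this together, a minimal sketch might look like the following. The class name ContiguousMatchDemo and the helper parse are hypothetical, and returning an empty list for malformed input is just one possible convention assumed here.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class ContiguousMatchDemo {

    // The \G-based pattern from this answer.
    private static final Pattern ITEM =
            Pattern.compile("(?:\\G(?!\\A),\\s*|\\A\\{)([\\w.-]+)(}\\z)?");

    // Hypothetical helper: returns the items, or an empty list if the
    // input does not have the expected format from start to end.
    static List<String> parse(String input) {
        List<String> items = new ArrayList<>();
        boolean reachedEnd = false;
        Matcher matcher = ITEM.matcher(input);
        while (matcher.find()) {
            items.add(matcher.group(1));
            // Group 2 is non-null only when "}" closes the string.
            reachedEnd = matcher.group(2) != null;
        }
        return reachedEnd ? items : Collections.<String>emptyList();
    }

    public static void main(String[] args) {
        String values = "{df1_apx.fhh.irtrs.d.rrr, ffd1-afp.farr.d.rrr.asgd, ffd2-afp.farr.d.rrr.asgd}";
        for (String item : parse(values)) {
            System.out.println(item);
        }
    }
}
```

Because every match after the first must start exactly where the previous one ended, any stray character stops the find loop early and group 2 is never set for the last match.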



Source: https://stackoverflow.com/questions/29374226/java-regex-inconsistent-groups
