Regular Expression to Split String based on space and matching quotes in java

一整个雨季 2021-01-27 18:54

I have a String that I need to split on spaces and on exactly matching quotes.

If the

string = \"It is fun \\\"to write\\\" regular\\\"expressi         


        
3 answers
  •  攒了一身酷
    2021-01-27 19:34

    It seems that you took the regex from this answer, but as you can see, it uses the find method of the Matcher class rather than split. That answer also handles ', which your input shows no sign of.

    So you can simplify this regex by removing the parts that handle ', which leaves

    [^\\s\"]+|\"([^\"]*)\"
    

    Also, since you want to include " as part of the token, you don't need to capture the text between the quotes in a separate group, so drop the parentheses in the \"([^\"]*)\" part

    [^\\s\"]+|\"[^\"]*\"
    

    Now all you need to add is the case where there is no closing " and the token runs to the end of the string instead. So change the regex to

    [^\\s\"]+|\"[^\"]*(\"|$)
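
    To see what the end-of-string alternative changes, here is a minimal sketch that runs both versions on the sample input. The class name QuoteRegexDemo, the constants STRICT and LENIENT, and the helper tokenize are hypothetical names made up for this illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class; the names here are not from the original answer.
class QuoteRegexDemo {
    // Quoted tokens must have a closing quote.
    static final Pattern STRICT = Pattern.compile("[^\\s\"]+|\"[^\"]*\"");
    // Quoted tokens may also end at the end of the input.
    static final Pattern LENIENT = Pattern.compile("[^\\s\"]+|\"[^\"]*(\"|$)");

    // Collects every match of the pattern, in order of appearance.
    static List<String> tokenize(Pattern pattern, String data) {
        List<String> tokens = new ArrayList<>();
        Matcher m = pattern.matcher(data);
        while (m.find()) {
            tokens.add(m.group());
        }
        return tokens;
    }

    public static void main(String[] args) {
        String data = "It is fun \"to write\" regular\"expression";
        // The unmatched quote is skipped, so the last token loses it.
        System.out.println(tokenize(STRICT, data));
        // The unmatched quote stays attached to the last token.
        System.out.println(tokenize(LENIENT, data));
    }
}
```

    With the stricter pattern the last token comes out as expression, because the lone " matches neither alternative and is skipped. With the $ alternative it comes out as "expression.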
    

    After this you can just use a Matcher to find all the tokens and store them somewhere, say in a List.

    Example:

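    // Needs java.util.ArrayList, java.util.List,
    // java.util.regex.Matcher and java.util.regex.Pattern.
    String data = "It is fun \"to write\" regular\"expression";
    List<String> matchList = new ArrayList<>();
    // A token is a run of non-space, non-quote characters, or a quoted
    // section that ends at its closing quote or at the end of the input.
    Pattern regex = Pattern.compile("[^\\s\"]+|\"[^\"]*(\"|$)");
    Matcher regexMatcher = regex.matcher(data);
    while (regexMatcher.find()) {
        System.out.println(regexMatcher.group());
        matchList.add(regexMatcher.group());
    }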
    

    Output:

    It
    is
    fun
    "to write"
    regular
    "expression
    

    A more complex expression that handles this data with split can look like

    String data = "It is fun \"to write\" regular \"expression";
    for(String s : data.split("(?

    but this approach is far more complicated than writing your own parser.


    Such a parser could look like

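    // Needs java.util.ArrayList and java.util.List.
    public static List<String> parse(String data) {
        List<String> tokens = new ArrayList<>();
        StringBuilder sb = new StringBuilder();
        boolean insideQuote = false;
        char previous = '\0';

        for (char ch : data.toCharArray()) {
            if (ch == ' ' && !insideQuote) {
                // A space outside quotes ends the current token, if any.
                if (sb.length() > 0 && previous != '"')
                    addTokenAndResetBuilder(sb, tokens);
            } else if (ch == '"') {
                if (insideQuote) {
                    // Closing quote: keep it and finish the quoted token.
                    sb.append(ch);
                    addTokenAndResetBuilder(sb, tokens);
                } else {
                    // Opening quote: flush pending text, then start a quoted token.
                    addTokenAndResetBuilder(sb, tokens);
                    sb.append(ch);
                }
                insideQuote = !insideQuote;
            } else {
                sb.append(ch);
            }
            previous = ch;
        }
        // Flush whatever is left, including an unterminated quoted token.
        addTokenAndResetBuilder(sb, tokens);

        return tokens;
    }

    // Adds the builder's content as a token (if non-empty) and clears the builder.
    private static void addTokenAndResetBuilder(StringBuilder sb, List<String> list) {
        if (sb.length() > 0) {
            list.add(sb.toString());
            sb.delete(0, sb.length());
        }
    }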
    

    Usage

    String data = "It is fun \"to write\" regular\"expression\"xxx\"yyy";
    for (String s : parse(data))
        System.out.println(s);
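
    As a cross-check, the regex from the Matcher example can be run on the same input. This is only a sketch: the class name UsageInputCheck and the helper regexTokens are hypothetical names made up for this illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical class name, used only for this illustration.
class UsageInputCheck {
    // Same pattern as in the Matcher example earlier in this answer.
    static final Pattern TOKEN = Pattern.compile("[^\\s\"]+|\"[^\"]*(\"|$)");

    // Returns all tokens the pattern finds, in order of appearance.
    static List<String> regexTokens(String data) {
        List<String> tokens = new ArrayList<>();
        Matcher m = TOKEN.matcher(data);
        while (m.find()) {
            tokens.add(m.group());
        }
        return tokens;
    }

    public static void main(String[] args) {
        String data = "It is fun \"to write\" regular\"expression\"xxx\"yyy";
        for (String s : regexTokens(data))
            System.out.println(s);
    }
}
```

    This prints eight tokens, one per line: It, is, fun, "to write", regular, "expression", xxx and "yyy. Tracing parse by hand on the same string gives the same eight tokens, so the two approaches appear to agree on this input.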
    
