Java regex - split but ignore text inside quotes?

深忆病人 2021-01-15 18:35

Using only regular expression methods, the method String.replaceAll, and ArrayList, how can I split a String into tokens but ignore delimiters that appear inside quotes? the

4 Answers
  • 2021-01-15 19:14

    You cannot in any reasonable way. You are posing a problem that regular expressions aren't good at.

  • 2021-01-15 19:17

    I know there is already a damn good accepted answer, but I would like to add another regex-based (and, may I say, simpler) approach: split the given text on any non-alphanumeric delimiter that is not inside single quotes.

    Regex:

    /(?=(([^']+'){2})*[^']*$)[^a-zA-Z\d]+/
    

    This basically means: match a run of non-alphanumeric text if it is followed by an even number of single quotes up to the end of the input. In other words, match non-alphanumeric text only when it is outside single quotes.

    Code:

    import java.util.Arrays;

    String string = "hello^world'this*has two tokens'#2ndToken";
    // Split on runs of non-alphanumerics that lie outside single quotes
    System.out.println(Arrays.toString(
        string.split("(?=(([^']+'){2})*[^']*$)[^a-zA-Z\\d]+")
    ));
    

    Output:

    [hello, world'this*has two tokens', 2ndToken]
    

    Demo:

    Here is a live working Demo of the above code.
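
Since the question asks for an ArrayList, here is a minimal sketch that wraps the same split in a helper and also shows what happens when the quotes are unbalanced. The class name `LookaheadSplit` and the method `splitOutsideQuotes` are made up for illustration and are not part of the answer above.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class LookaheadSplit {
    // Same pattern as in the answer: a run of non-alphanumerics that is
    // followed by an even number of single quotes up to the end of input.
    static final String DELIMITER = "(?=(([^']+'){2})*[^']*$)[^a-zA-Z\\d]+";

    // Hypothetical helper: split, then copy the tokens into an ArrayList.
    static List<String> splitOutsideQuotes(String input) {
        return new ArrayList<>(Arrays.asList(input.split(DELIMITER)));
    }

    public static void main(String[] args) {
        // Balanced quotes: prints [hello, world'this*has two tokens', 2ndToken]
        System.out.println(splitOutsideQuotes("hello^world'this*has two tokens'#2ndToken"));
        // One unmatched quote: every delimiter sees an odd number of quotes
        // ahead of it, so nothing is split. Prints [a^b'c]
        System.out.println(splitOutsideQuotes("a^b'c"));
    }
}
```

The second call is the thing to watch for: the lookahead counts quotes all the way to the end of the string, so a single stray quote changes how every delimiter before it is treated.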

  • 2021-01-15 19:17

    Do not use a regular expression for this. It won't work. Use (or write) a parser instead.

    You should use the right tool for the right task.
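
For comparison, a hand-written scanner for this case can be quite small. The following is only a sketch under assumed rules (letters and digits form tokens, a single quote toggles quoted mode, anything else outside quotes is a delimiter); the class `QuoteAwareTokenizer` and the method `tokenize` are hypothetical names, not something from this answer.

```java
import java.util.ArrayList;
import java.util.List;

class QuoteAwareTokenizer {
    // Hypothetical helper. Assumed rules: a single quote toggles "inside
    // quotes" and is kept in the token; inside quotes every character is
    // kept; outside quotes only letters and digits are kept, and any other
    // character ends the current token.
    // Note: Character.isLetterOrDigit also accepts non-ASCII letters,
    // which is broader than the [a-zA-Z0-9] class used in the regex answers.
    static List<String> tokenize(String input) {
        List<String> tokens = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        boolean inQuotes = false;
        for (int i = 0; i < input.length(); i++) {
            char c = input.charAt(i);
            if (c == '\'') {
                inQuotes = !inQuotes;
                current.append(c);
            } else if (inQuotes || Character.isLetterOrDigit(c)) {
                current.append(c);
            } else if (current.length() > 0) {
                // Delimiter outside quotes: close the current token.
                tokens.add(current.toString());
                current.setLength(0);
            }
        }
        if (current.length() > 0) {
            tokens.add(current.toString());
        }
        return tokens;
    }

    public static void main(String[] args) {
        // Prints [hello, world'this*has two tokens', 2ndToken]
        System.out.println(tokenize("hello^world'this*has two tokens'#2ndToken"));
    }
}
```

A loop like this makes the handling of an unmatched quote an explicit decision (here the rest of the input simply stays in the last token) instead of a side effect of the pattern.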

  • 2021-01-15 19:27

    Use a Matcher to identify the parts you want to keep, rather than the parts you want to split on:

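    import java.util.regex.Matcher;
    import java.util.regex.Pattern;

    String s = "hello^world'this*has two tokens'";
    // A token is one or more alphanumeric runs and/or single-quoted sections
    Pattern pattern = Pattern.compile("([a-zA-Z0-9]+|'[^']*')+");
    Matcher matcher = pattern.matcher(s);
    while (matcher.find()) {
        System.out.println(matcher.group(0));
    }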
    

    See it working online: ideone
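
If the tokens should end up in an ArrayList, as the question asks, the same loop can collect them instead of printing. This is a small sketch using the pattern from this answer; the class `MatcherTokens` and the method `tokens` are hypothetical names chosen for the example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class MatcherTokens {
    // Pattern from the answer: alphanumeric runs and quoted sections,
    // possibly adjacent to each other, form one token.
    static final Pattern TOKEN = Pattern.compile("([a-zA-Z0-9]+|'[^']*')+");

    // Hypothetical helper: collect every match into an ArrayList.
    static List<String> tokens(String input) {
        List<String> result = new ArrayList<>();
        Matcher matcher = TOKEN.matcher(input);
        while (matcher.find()) {
            result.add(matcher.group());
        }
        return result;
    }

    public static void main(String[] args) {
        // Prints [hello, world'this*has two tokens']
        System.out.println(tokens("hello^world'this*has two tokens'"));
    }
}
```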
