Equivalent to StringTokenizer with multi-character delimiters

你的背包 2021-01-18 07:34

I am trying to split a String into tokens.

The token delimiters are not single characters, and some delimiters are contained in others (for example, & and &&), and

3 Answers
  • 2021-01-18 08:28

    Split won't do it for you, as it removes the delimiters. You probably need to tokenize the string yourself (e.g. with a for-loop) or use a framework like http://www.antlr.org/
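
A minimal sketch of the hand-written loop suggested above, in case it helps. The delimiter set ("&&", "=>", "&", "=") and the names ManualTokenizer and tokenize are assumptions made up for illustration; they are not from the question. Longer delimiters are listed first so that "&&" is matched before "&".

```java
import java.util.ArrayList;
import java.util.List;

class ManualTokenizer {
    // Hypothetical delimiter set, longest first so "&&" wins over "&".
    static final String[] DELIMITERS = {"&&", "=>", "&", "="};

    static List<String> tokenize(String input) {
        List<String> tokens = new ArrayList<>();
        int start = 0; // start of the pending non-delimiter text
        int i = 0;
        while (i < input.length()) {
            // Find the first (i.e. longest) delimiter that begins at position i.
            String hit = null;
            for (String d : DELIMITERS) {
                if (input.startsWith(d, i)) {
                    hit = d;
                    break;
                }
            }
            if (hit == null) {
                i++; // ordinary character, keep scanning
                continue;
            }
            if (start < i) {
                tokens.add(input.substring(start, i)); // text before the delimiter
            }
            tokens.add(hit); // the delimiter itself is kept as a token
            i += hit.length();
            start = i;
        }
        if (start < input.length()) {
            tokens.add(input.substring(start)); // trailing text
        }
        return tokens;
    }

    public static void main(String[] args) {
        // prints [a, &&, b, =>, c, &, d, =, e]
        System.out.println(tokenize("a&&b=>c&d=e"));
    }
}
```

The order of the array matters here: with "&" listed before "&&", the input "a&&b" would yield two "&" tokens instead of one "&&".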

  • 2021-01-18 08:34

    Try this:

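    String test = "a & b&&c=>d=A";
    // Matches "&" or "&&", and "=" or "=>"
    String regEx = "(&[&]?|=[>]?)";

    // split() discards the delimiters; only the text between them is returned
    String[] res = test.split(regEx);
    for (String s : res) {
        System.out.println("Token: " + s);
    }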
    

    I added the '=A' at the end to show that that is also parsed.

    As mentioned in another answer, if you need the atypical behaviour of keeping the delimiters in the result, you will probably need to write your own parser. In that case, though, you really have to think about what a "delimiter" is in your code.
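
To make the trade-off concrete, here is a small sketch of what a split-based approach returns for the same input. The class and method names (SplitDemo, splitAndTrim) are hypothetical, and the trimming of surrounding spaces is an extra step added here as an assumption, not part of the answer's code. The regex "&&?|=>?" is just a shorter spelling of the same delimiters.

```java
import java.util.ArrayList;
import java.util.List;

class SplitDemo {
    // Splits on "&", "&&", "=" or "=>" and drops the delimiters.
    static List<String> splitAndTrim(String input) {
        List<String> tokens = new ArrayList<>();
        for (String piece : input.split("&&?|=>?")) {
            String t = piece.trim(); // assumption: surrounding spaces are not wanted
            if (!t.isEmpty()) {
                tokens.add(t);
            }
        }
        return tokens;
    }

    public static void main(String[] args) {
        // prints [a, b, c, d, A]
        System.out.println(splitAndTrim("a & b&&c=>d=A"));
    }
}
```

Note that the result no longer says which delimiter separated which tokens, which is exactly the information a StringTokenizer with returnDelims set to true would preserve.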

  • 2021-01-18 08:35

    You can use a Pattern and a simple loop to get the result you are looking for:

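    import java.util.ArrayList;
    import java.util.List;
    import java.util.regex.Matcher;
    import java.util.regex.Pattern;

    List<String> res = new ArrayList<String>();
    // Delimiters: "&" or "&&", "=" or "=>", or a run of spaces
    Pattern p = Pattern.compile("([&]{1,2}|=>?| +)");
    String s = "s=a&=>b";
    Matcher m = p.matcher(s);
    int pos = 0; // end of the previous delimiter
    while (m.find()) {
        // Text between the previous delimiter and this one is a token
        if (pos != m.start()) {
            res.add(s.substring(pos, m.start()));
        }
        // Keep the delimiter itself as a token
        res.add(m.group());
        pos = m.end();
    }
    // Whatever follows the last delimiter is the final token
    if (pos != s.length()) {
        res.add(s.substring(pos));
    }
    for (String t : res) {
        System.out.println("'" + t + "'");
    }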
    

    This produces the result below:

    's'
    '='
    'a'
    '&'
    '=>'
    'b'
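
If the delimiters are only known at runtime, the same loop can be wrapped in a reusable method. The following is a sketch under assumptions: the names RegexTokenizer, compileDelimiters and tokenize are made up for illustration, and unlike the pattern above it does not treat spaces as delimiters unless you pass " " in yourself. Each delimiter is escaped with Pattern.quote and the longer ones are placed first in the alternation, so "&&" and "=>" are tried before "&" and "=".

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class RegexTokenizer {
    // Builds an alternation such as \Q&&\E|\Q=>\E|\Q&\E|\Q=\E, longest first.
    static Pattern compileDelimiters(String... delimiters) {
        List<String> sorted = new ArrayList<>(Arrays.asList(delimiters));
        sorted.sort((a, b) -> Integer.compare(b.length(), a.length()));
        StringBuilder regex = new StringBuilder();
        for (String d : sorted) {
            if (regex.length() > 0) {
                regex.append('|');
            }
            regex.append(Pattern.quote(d)); // escape regex metacharacters
        }
        return Pattern.compile(regex.toString());
    }

    // Returns both the delimiters and the text between them, in order.
    static List<String> tokenize(Pattern delimiters, String input) {
        List<String> tokens = new ArrayList<>();
        Matcher m = delimiters.matcher(input);
        int pos = 0;
        while (m.find()) {
            if (pos != m.start()) {
                tokens.add(input.substring(pos, m.start()));
            }
            tokens.add(m.group());
            pos = m.end();
        }
        if (pos != input.length()) {
            tokens.add(input.substring(pos));
        }
        return tokens;
    }

    public static void main(String[] args) {
        Pattern p = compileDelimiters("&", "&&", "=", "=>");
        // prints [s, =, a, &, =>, b]
        System.out.println(tokenize(p, "s=a&=>b"));
    }
}
```

Sorting by length matters because Java's regex alternation takes the first branch that matches, not the longest one.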
    