I am trying to split a String into tokens.
The token delimiters are not single characters, and some delimiters are contained in others (for example, & and &&), and
split() won't do this for you, because it removes the delimiters. You probably need to tokenize the string on your own (e.g. with a for-loop) or use a parser generator like http://www.antlr.org/
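To illustrate the "tokenize it on your own" option, here is a minimal sketch of a hand-written scan that keeps the delimiters. The class name ManualTokenizer and the delimiter set are assumptions made for illustration (the question only names & and && explicitly). Longer delimiters are listed before their prefixes so that && is not read as two & tokens.

```java
import java.util.ArrayList;
import java.util.List;

class ManualTokenizer {
    // Hypothetical delimiter set: longer delimiters come before their prefixes.
    private static final String[] DELIMITERS = {"&&", "=>", "&", "="};

    static List<String> tokenize(String input) {
        List<String> tokens = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        int i = 0;
        while (i < input.length()) {
            // Find the first (and therefore longest) delimiter starting at i.
            String hit = null;
            for (String d : DELIMITERS) {
                if (input.startsWith(d, i)) {
                    hit = d;
                    break;
                }
            }
            if (hit == null) {
                // Ordinary character: extend the current token.
                current.append(input.charAt(i));
                i++;
            } else {
                // Flush the pending token, then keep the delimiter as a token.
                if (current.length() > 0) {
                    tokens.add(current.toString());
                    current.setLength(0);
                }
                tokens.add(hit);
                i += hit.length();
            }
        }
        if (current.length() > 0) {
            tokens.add(current.toString());
        }
        return tokens;
    }

    public static void main(String[] args) {
        System.out.println(tokenize("a&&b=>c&d"));
    }
}
```

This version does no whitespace handling; spaces simply stay inside the surrounding token.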
Try this:
String test = "a & b&&c=>d=A";
String regEx = "(&[&]?|=[>]?)";
String[] res = test.split(regEx);
for (String s : res) {
    System.out.println("Token: " + s);
}
I added the '=A' at the end to show that that is also parsed.
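For reference, a small sketch of what that split() call returns, wrapped in a hypothetical SplitDemo class so it can be run as is. The delimiters themselves are dropped and the spaces around them stay attached to the neighbouring tokens.

```java
import java.util.Arrays;

class SplitDemo {
    // Same regex as above: "&" optionally followed by "&", or "=" optionally followed by ">".
    static String[] splitTokens(String input) {
        return input.split("(&[&]?|=[>]?)");
    }

    public static void main(String[] args) {
        // Prints the pieces between the delimiters; the delimiters are not included.
        System.out.println(Arrays.toString(splitTokens("a & b&&c=>d=A")));
    }
}
```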
As mentioned in another answer, if you need the atypical behaviour of keeping the delimiters in the result, you will probably need to write the parser yourself. In that case, though, you really have to think about what a "delimiter" is in your code.
You can use Pattern and Matcher with a simple loop to get the result you are looking for:
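import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

List<String> res = new ArrayList<String>();
// Delimiters: "&" or "&&", "=" or "=>", or a run of spaces.
Pattern p = Pattern.compile("([&]{1,2}|=>?| +)");
String s = "s=a&=>b";
Matcher m = p.matcher(s);
int pos = 0;
while (m.find()) {
    // Add the text between the previous delimiter and this one, if any.
    if (pos != m.start()) {
        res.add(s.substring(pos, m.start()));
    }
    // Keep the delimiter itself as a token.
    res.add(m.group());
    pos = m.end();
}
// Add whatever follows the last delimiter.
if (pos != s.length()) {
    res.add(s.substring(pos));
}
for (String t : res) {
    System.out.println("'" + t + "'");
}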
This produces the result below:
's'
'='
'a'
'&'
'=>'
'b'
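The same loop can be wrapped in a reusable method. The following is only a sketch under a few assumptions: the class name DelimiterTokenizer is made up for illustration, the delimiter set is limited to &, &&, = and =>, and whitespace around tokens is trimmed rather than returned as its own token (unlike the pattern above, which treats runs of spaces as delimiters).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class DelimiterTokenizer {
    // Longer alternatives come first so "&&" wins over "&" and "=>" over "=".
    private static final Pattern DELIMITERS = Pattern.compile("&&|&|=>|=");

    static List<String> tokenize(String input) {
        List<String> tokens = new ArrayList<>();
        Matcher m = DELIMITERS.matcher(input);
        int pos = 0;
        while (m.find()) {
            // Text between the previous delimiter and this one.
            addIfNotBlank(tokens, input.substring(pos, m.start()));
            // The delimiter itself is kept as a token.
            tokens.add(m.group());
            pos = m.end();
        }
        // Text after the last delimiter.
        addIfNotBlank(tokens, input.substring(pos));
        return tokens;
    }

    // Assumption: surrounding whitespace is not significant, so it is trimmed.
    private static void addIfNotBlank(List<String> tokens, String text) {
        String trimmed = text.trim();
        if (!trimmed.isEmpty()) {
            tokens.add(trimmed);
        }
    }

    public static void main(String[] args) {
        // Prints: [a, &, b, &&, c, =>, d, =, A]
        System.out.println(tokenize("a & b&&c=>d=A"));
    }
}
```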