I have a String which I need to split based on spaces and exactly matching quotes.
If the
string = \"It is fun \\\"to write\\\" regular\\\"expressi
It seems that you just used the regex from this answer, but as you can see, it doesn't use split but the find method from the Matcher class. That answer also takes care of ', whereas your input shows no sign of it.
So you can improve this regex by removing the parts that handle ', which will make it look like
[^\\s\"]+|\"([^\"]*)\"
Also, since you want to include " as part of the token, you don't need to place the match between the " characters in a separate group, so get rid of the parentheses in the \"([^\"]*)\" part:
[^\\s\"]+|\"[^\"]*\"
Now all you need to do is add the case where there is no closing " and you reach the end of the string instead. So change this regex to
[^\\s\"]+|\"[^\"]*(\"|$)
After this you can just use Matcher's find method to locate all tokens and store them somewhere, let's say in a List.
Example:
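// Requires java.util.ArrayList, java.util.List, java.util.regex.Matcher and java.util.regex.Pattern.
String data = "It is fun \"to write\" regular\"expression";
List<String> matchList = new ArrayList<>();
// Match either a bare word or a quoted run that may end at the end of the input.
Pattern regex = Pattern.compile("[^\\s\"]+|\"[^\"]*(\"|$)");
Matcher regexMatcher = regex.matcher(data);
while (regexMatcher.find()) {
    System.out.println(regexMatcher.group());
    matchList.add(regexMatcher.group());
}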
Output:
It
is
fun
"to write"
regular
"expression
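To see what the (\"|$) alternative buys you, here is a minimal sketch that runs both versions of the regex over the sample input. The class name QuoteRegexDemo and the helper tokenize are made up for this illustration and are not part of the original code.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class; the name and the tokenize helper are illustrative only.
class QuoteRegexDemo {

    // Collects every match of the given regex in data, in order of appearance.
    static List<String> tokenize(String regex, String data) {
        List<String> tokens = new ArrayList<>();
        Matcher m = Pattern.compile(regex).matcher(data);
        while (m.find()) {
            tokens.add(m.group());
        }
        return tokens;
    }

    public static void main(String[] args) {
        String data = "It is fun \"to write\" regular\"expression";

        // Without the end-of-string alternative the unclosed quote cannot match,
        // so the matcher skips the quote character and finds only the bare word.
        System.out.println(tokenize("[^\\s\"]+|\"[^\"]*\"", data));
        // [It, is, fun, "to write", regular, expression]

        // With (\"|$) the unclosed quote is allowed to run to the end of the input.
        System.out.println(tokenize("[^\\s\"]+|\"[^\"]*(\"|$)", data));
        // [It, is, fun, "to write", regular, "expression]
    }
}
```

The only difference is the last token: the version without $ silently drops the opening quote, which is easy to miss.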
A more complex expression to handle this data can look like
String data = "It is fun \"to write\" regular \"expression";
for(String s : data.split("(?
but this approach is far more complicated than writing your own parser.
Such a parser could look like
public static List<String> parse(String data) {
    List<String> tokens = new ArrayList<>();
    StringBuilder sb = new StringBuilder();
    boolean insideQuote = false;
    char previous = '\0';
    for (char ch : data.toCharArray()) {
        if (ch == ' ' && !insideQuote) {
            // A space outside quotes ends the current token.
            if (sb.length() > 0 && previous != '"')
                addTokenAndResetBuilder(sb, tokens);
        } else if (ch == '"') {
            if (insideQuote) {
                // Closing quote: keep it and finish the quoted token.
                sb.append(ch);
                addTokenAndResetBuilder(sb, tokens);
            } else {
                // Opening quote: finish any pending token, then start a new one.
                addTokenAndResetBuilder(sb, tokens);
                sb.append(ch);
            }
            insideQuote = !insideQuote;
        } else {
            sb.append(ch);
        }
        previous = ch;
    }
    // Add whatever is left, including an unclosed quoted part.
    addTokenAndResetBuilder(sb, tokens);
    return tokens;
}

private static void addTokenAndResetBuilder(StringBuilder sb, List<String> list) {
    if (sb.length() > 0) {
        list.add(sb.toString());
        sb.delete(0, sb.length());
    }
}
Usage:
String data = "It is fun \"to write\" regular\"expression\"xxx\"yyy";
for (String s : parse(data))
    System.out.println(s);
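As a sanity check, the parser and the final regex should agree on this input. The sketch below bundles a condensed copy of the parser with a regex-based tokenizer so that it runs on its own. The class name TokenizerCrossCheck and the methods regexTokens and flush are assumptions made for this example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical cross-check class; all names here are illustrative only.
class TokenizerCrossCheck {

    // Regex approach: a bare word, or a quoted run that may end at end of input.
    static List<String> regexTokens(String data) {
        List<String> tokens = new ArrayList<>();
        Matcher m = Pattern.compile("[^\\s\"]+|\"[^\"]*(\"|$)").matcher(data);
        while (m.find()) {
            tokens.add(m.group());
        }
        return tokens;
    }

    // Condensed copy of the hand-written parser. The `previous` check is left out
    // here because the builder is already empty right after a closing quote.
    static List<String> parse(String data) {
        List<String> tokens = new ArrayList<>();
        StringBuilder sb = new StringBuilder();
        boolean insideQuote = false;
        for (char ch : data.toCharArray()) {
            if (ch == ' ' && !insideQuote) {
                flush(sb, tokens);
            } else if (ch == '"') {
                if (insideQuote) {
                    sb.append(ch);
                    flush(sb, tokens);
                } else {
                    flush(sb, tokens);
                    sb.append(ch);
                }
                insideQuote = !insideQuote;
            } else {
                sb.append(ch);
            }
        }
        flush(sb, tokens);
        return tokens;
    }

    // Adds the pending token, if any, and clears the builder.
    private static void flush(StringBuilder sb, List<String> tokens) {
        if (sb.length() > 0) {
            tokens.add(sb.toString());
            sb.setLength(0);
        }
    }

    public static void main(String[] args) {
        String data = "It is fun \"to write\" regular\"expression\"xxx\"yyy";
        System.out.println(parse(data));
        // [It, is, fun, "to write", regular, "expression", xxx, "yyy]
        System.out.println(parse(data).equals(regexTokens(data)));
        // true
    }
}
```

The two approaches can still differ on other input: the regex treats any whitespace (\s, so tabs and newlines too) as a separator, while the parser only splits on the space character.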