Building on @zx81's answer, because the matching idea is really nice, I've added the Java 9 `results()` call, which returns a `Stream`. Since the OP wanted to use `split`, I've collected the matches into a `String[]`, as `split` does.
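Caution: if you have spaces after your comma separators (`a, b, "c,d"`), you need to change the pattern. Otherwise `[^,]+` starts matching at the leading space, so the unquoted fields keep that space and a quoted field preceded by a space is cut at its inner comma.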
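One possible adjustment for that case, as a minimal sketch: require each unquoted field to start with a character that is neither a comma nor whitespace, so the space after a separator is skipped instead of matched. The pattern `"[^"]*"|[^,\s][^,]*`, the class name `SpaceTolerantSplit`, and the method name `split` are assumptions made for this illustration, not part of the original answer.

```java
import java.util.regex.MatchResult;
import java.util.regex.Pattern;

class SpaceTolerantSplit {
    // Assumed pattern (not from the original answer): a quoted field, or an
    // unquoted field whose first character is neither a comma nor whitespace.
    private static final Pattern FIELD =
            Pattern.compile("\"[^\"]*\"|[^,\\s][^,]*");

    static String[] split(String input) {
        return FIELD.matcher(input)
                .results()                 // requires Java 9 or higher
                .map(MatchResult::group)
                .map(String::trim)         // drop spaces left before a comma
                .toArray(String[]::new);
    }

    public static void main(String[] args) {
        for (String field : split("a, b, \"c,d\"")) {
            System.out.println(field);
        }
    }
}
```

Like the original pattern, this sketch skips empty fields (`a,,b`) rather than returning empty strings for them.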
JShell demo
$ jshell
-> String so = "123,test,444,\"don't split, this\",more test,1";
| Added variable so of type String with initial value "123,test,444,"don't split, this",more test,1"
-> Pattern.compile("\"[^\"]*\"|[^,]+").matcher(so).results();
| Expression value is: java.util.stream.ReferencePipeline$Head@2038ae61
| assigned to temporary variable $68 of type java.util.stream.Stream<MatchResult>
-> $68.map(MatchResult::group).toArray(String[]::new);
| Expression value is: [Ljava.lang.String;@6b09bb57
| assigned to temporary variable $69 of type String[]
-> Arrays.stream($69).forEach(System.out::println);
123
test
444
"don't split, this"
more test
1
Code
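import java.util.regex.MatchResult;
import java.util.regex.Pattern;

String so = "123,test,444,\"don't split, this\",more test,1";
// Match either a whole quoted field or a run of non-comma characters.
String[] parts = Pattern.compile("\"[^\"]*\"|[^,]+")
        .matcher(so)
        .results()                 // Java 9+: Stream<MatchResult>
        .map(MatchResult::group)   // matched text of each result
        .toArray(String[]::new);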
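Because `results()` is only available from Java 9 onward, here is a rough equivalent for older runtimes, sketched with the classic `Matcher.find()` loop. The class name `FindLoopSplit` and the method name `split` are made up for this illustration; the pattern is the one used above.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class FindLoopSplit {
    // Same pattern as in the answer: quoted field first, then non-comma run.
    private static final Pattern FIELD = Pattern.compile("\"[^\"]*\"|[^,]+");

    static String[] split(String input) {
        List<String> fields = new ArrayList<>();
        Matcher matcher = FIELD.matcher(input);
        // find() advances to the next match; group() returns its text.
        while (matcher.find()) {
            fields.add(matcher.group());
        }
        return fields.toArray(new String[0]);
    }

    public static void main(String[] args) {
        String so = "123,test,444,\"don't split, this\",more test,1";
        for (String field : split(so)) {
            System.out.println(field);
        }
    }
}
```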
Explanation
- Regex `"[^"]"` matches: a quote, a single character that is not a quote, a quote.
- Regex `"[^"]*"` matches: a quote, anything but a quote 0 or more times, a quote.
- That regex needs to go first to "win"; otherwise, matching anything but a comma 1 or more times (that is, `[^,]+`) would "win".
- `results()` requires Java 9 or higher.
- It returns `Stream<MatchResult>`, which I map using the `group()` call and collect into an array of Strings. A parameterless `toArray()` call would return `Object[]`.
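To see why the order of the alternatives matters, here is a small sketch that runs both orderings against the same input. The class name `AlternationOrderDemo` and the helper `matchAll` are hypothetical names introduced only for this illustration.

```java
import java.util.regex.MatchResult;
import java.util.regex.Pattern;

class AlternationOrderDemo {
    // Hypothetical helper: collect every match of regex in input (Java 9+).
    static String[] matchAll(String regex, String input) {
        return Pattern.compile(regex)
                .matcher(input)
                .results()
                .map(MatchResult::group)
                .toArray(String[]::new);
    }

    public static void main(String[] args) {
        String so = "123,test,444,\"don't split, this\",more test,1";

        // Quoted alternative first: the quoted field stays in one piece.
        String[] quotedFirst = matchAll("\"[^\"]*\"|[^,]+", so);

        // Non-comma alternative first: it also matches at the opening quote,
        // so the quoted field is cut at its inner comma.
        String[] commaFirst = matchAll("[^,]+|\"[^\"]*\"", so);

        System.out.println(quotedFirst.length); // 6
        System.out.println(commaFirst.length);  // 7
    }
}
```

With the second ordering, the quoted field comes back as two pieces, `"don't split` and ` this"`, because the regex engine tries the alternatives from left to right at each position.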