I am trying to do a split on a string with comma delimiter
my $string=\'ab,12,20100401,xyz(A,B)\';
my @array=split(\',\',$string);
If I do
Here is one way that should work.
use Regexp::Common;
my $string = 'ab,12,20100401,xyz(A,B)';
my @array = ($string =~ /(?:$RE{balanced}{-parens=>'()'}|[^,])+/g);
Regexp::Common can be installed from CPAN.
There is a bug in this code, coming from the depths of Regexp::Common. Be warned that this will (unfortunately) fail to match the lack of space between ,,
.
Well, old question, but I just happened to wrestle with this all night, and the question was never marked answered, so in case anyone arrives here by Google as I did, here's what I finally got. It's a very short answer using only built-in PERL regex features:
my $string='ab,12,20100401,xyz(A,B)';
string =~ 's/((\((?>[^)(]*(?2)?)*\))|[^,()]*)(*SKIP)([,])/$1\n/g';
my @array=split('\n',$string);
Commas that are not inside parentheses are changed to newlines and then the array is split on them. This will ignore commas inside any level of nested parentheses, as long as they're properly balanced with a matching number of open and close parens.
This assumes you won't have newline \n
characters in the initial value of $string. If you need to, either temporarily replace them with something else before the substitution line and then use a loop to replace back after the split
, or just pick a different delimiter to split the array on.
Limit the number of elements it can be split into:
split(',', $string, 4)
Here's another way:
my $string='ab,12,20100401,xyz(A,B)';
my @array = ($string =~ /(
[^,]*\([^)]*\) # comma inside parens is part of the word
|
[^,]*) # split on comma outside parens
(?:,|$)/gx);
Produces:
ab
12
20100401
xyz(A,B)
use Text::Balanced qw(extract_bracketed);
my $string = "ab,12,20100401,xyz(A,B(a,d))";
my @params = ();
while ($string) {
if ($string =~ /^([^(]*?),/) {
push @params, $1;
$string =~ s/^\Q$1\E\s*,?\s*//;
} else {
my ($ext, $pre);
($ext, $string, $pre) = extract_bracketed($string,'()','[^()]+');
push @params, "$pre$ext";
$string =~ s/^\s*,\s*//;
}
}
This one supports:
Here is my attempt. It should handle depth well and could even be extended to include other bracketed symbols easily (though harder to be sure that they MATCH). This method will not in general work for quotation marks rather than brackets.
#!/usr/bin/perl
use strict;
use warnings;
my $string='ab,12,20100401,xyz(A(2,3),B)';
print "$_\n" for parse($string);
sub parse {
my ($string) = @_;
my @fields;
my @comma_separated = split(/,/, $string);
my @to_be_joined;
my $depth = 0;
foreach my $field (@comma_separated) {
my @brackets = $field =~ /(\(|\))/g;
foreach (@brackets) {
$depth++ if /\(/;
$depth-- if /\)/;
}
if ($depth == 0) {
push @fields, join(",", @to_be_joined, $field);
@to_be_joined = ();
} else {
push @to_be_joined, $field;
}
}
return @fields;
}