Perl split list on commas except when within brackets?

庸人自扰 2021-01-02 11:09

I have a database with a number of fields containing comma separated values. I need to split these fields in Perl, which is straightforward enough except that some of the va

4 Answers
  • 2021-01-02 11:44

    The solution you have chosen is superior, but for those who would say otherwise: Perl regular expressions (since 5.10) support recursive subpatterns such as (?2), which can match nested parentheses. The following works fine

    use strict;
    use warnings;
    
    my $s = q{recycling, environmental science, interdisciplinary (e.g., consumerism, waste management, chemistry, toxicology, government policy, and ethics), consumer education};
    
    my @parts;
    
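    # Each element is a run of plain text and balanced parenthesised groups;
    # (?2) recurses into group 2 to handle nesting. The outer group uses +
    # rather than * so that no empty element is matched at the end of the string.
    push @parts, $1 while $s =~ /
    ((?:
      [^(),]+ |
      ( \(
        (?: [^()]+ | (?2) )*
      \) )
    )+)
    (?: ,\s* | $)
    /xg;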
    
    
    print "$_\n" for @parts;
    

    even if the parentheses are nested further. No, it's not pretty, but it does work!
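
To check the claim about deeper nesting, the same recursive pattern can be wrapped in a small sub and run on a nested input. This is only a sketch: the name `split_outside_parens` and the sample string are made up for illustration, it assumes Perl 5.10 or later for `(?2)`, and it adds a `\G` anchor (not in the answer above) so that each match has to start where the previous one ended.

```perl
use strict;
use warnings;

# Hypothetical helper: split on commas that are not inside
# (possibly nested) parentheses.
sub split_outside_parens {
    my ($s) = @_;
    my @parts;
    while ($s =~ /
        \G
        ((?:
            [^(),]+                          # plain text outside parentheses
          | ( \( (?: [^()]+ | (?2) )* \) )   # balanced group, recursing into capture group 2
        )+)
        (?: ,\s* | $)
    /xg) {
        push @parts, $1;
    }
    return @parts;
}

my @nested = split_outside_parens('a, b (c, d (e, f), g), h');
print "$_\n" for @nested;
```

On this input the sub should return three elements, with `b (c, d (e, f), g)` kept in one piece. Because of the `+`, empty fields (two commas in a row) are not supported by this sketch.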

  • 2021-01-02 11:52

    Did anyone say you have to do it in one step? You could slice off values in a loop. Given your example, you could use something like this.

    use strict;
    use warnings;
    use 5.010;
    
    my $s = q{recycling, environmental science, interdisciplinary (e.g., consumerism, waste management, chemistry, toxicology, government policy, and ethics), consumer education};
    
    my @parts;
    while(1){
    
            my ($elem, $rest) = $s =~ m/^((?:\w|\s)+)(?:,\s*([^\(]*.*))?$/;
            if (not $elem) {
                    say "second approach";
                    ($elem, $rest) = $s =~ m/^(?:((?:\w|\s)+\s*\([^\)]+\)),\s*(.*))$/;
            }
            $s = $rest;
            push @parts, $elem;
            last if not $s;
    
    }
    
    use Data::Dumper;
    print Dumper \@parts;
    
  • 2021-01-02 11:54

    Try this:

    my $s = q{recycling, environmental science, interdisciplinary (e.g., consumerism, waste management, chemistry, toxicology, government policy, and ethics), consumer education};
    
    my @parts = split /(?![^(]+\)), /, $s;
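
The look-ahead refuses to split at a comma that is followed by a `)` with no `(` in between, which is a stand-in for "this comma is inside parentheses". A small sketch of what that means in practice, using two made-up sample strings (`$flat` and `$nested` are illustrative, not from the question):

```perl
use strict;
use warnings;

my $flat   = 'a, b (c, d), e';
my $nested = 'a, b (c, d (e, f), g), h';

# Split on ", " unless the comma is followed by ")" before any "(".
my @flat_parts   = split /(?![^(]+\)), /, $flat;
my @nested_parts = split /(?![^(]+\)), /, $nested;

print scalar(@flat_parts),   " parts: ", join(' | ', @flat_parts),   "\n";
print scalar(@nested_parts), " parts: ", join(' | ', @nested_parts), "\n";
```

With one level of parentheses the split comes out as expected. With nested parentheses the comma after `c` is followed by a `(` before the next `)`, so the look-ahead lets it through and the element is cut in two; the one-liner is best kept for data known to be un-nested.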
    
  • 2021-01-02 11:58

    Another approach that uses loops and split. I haven't tested the performance, but shouldn't this be faster than the look-ahead regexp solutions (as the length of $str increases)?

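    my $str = q{recycling, environmental science, interdisciplinary (e.g., consumerism, waste management, chemistry, toxicology, government policy, and ethics), consumer education};

    my @elems = split /,\s*/, $str;
    my @answer;
    while (@elems) {
        # Pieces before the next opening parenthesis are complete elements.
        # The @elems guard stops the loop from running past the end of the list.
        push @answer, shift @elems while @elems && $elems[0] !~ /\(/;
        last unless @elems;
        # Collect pieces up to and including the one with the closing parenthesis.
        my @parens;
        push @parens, shift @elems while @elems && $elems[0] !~ /\)/;
        push @parens, shift @elems if @elems;
        push @answer, join ", ", @parens;
    }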
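
Taking the loop idea one step further, a single pass that counts parenthesis depth needs neither look-ahead nor a rejoin step, and it also copes with nesting. This is a different technique from the split-and-rejoin above, offered only as a sketch; the name `split_by_depth` is hypothetical and no performance claim is made for it.

```perl
use strict;
use warnings;

# Hypothetical helper: walk the string once, tracking parenthesis depth,
# and cut only at commas seen at depth zero.
sub split_by_depth {
    my ($str) = @_;
    my @parts;
    my $depth   = 0;
    my $current = '';
    for my $ch (split //, $str) {
        if ($ch eq '(') {
            $depth++;
        }
        elsif ($ch eq ')') {
            $depth-- if $depth > 0;    # ignore a stray closing parenthesis
        }
        if ($ch eq ',' && $depth == 0) {
            push @parts, $current;     # top-level comma ends the element
            $current = '';
        }
        else {
            $current .= $ch;
        }
    }
    push @parts, $current;
    s/^\s+|\s+$//g for @parts;         # trim surrounding whitespace
    return @parts;
}

my @fields = split_by_depth('a, b (c, d (e, f), g), h');
print "$_\n" for @fields;
```

Unlike the regex versions, this keeps empty fields (two top-level commas in a row yield an empty string), which may or may not be what the data calls for.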
    