How can I store captures from a Perl regular expression into separate variables?

无人及你 2020-12-15 06:01

I have a regex:

/abc(def)ghi(jkl)mno(pqr)/igs

How would I capture the result of each set of parentheses into three different variables, one for each?

5 answers
  • 2020-12-15 06:44

    Your question is a bit ambiguous to me, but I think you want to do something like this:

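    my (@first, @second, @third);
    # Match in scalar context so that /g steps through the string one match at a time.
    # (A list assignment from a /g match in the loop condition grabs every match at once
    # and resets the search position, so the loop would never end.)
    while ($string =~ /abc(def)ghi(jkl)mno(pqr)/igs) {
        push @first,  $1;
        push @second, $2;
        push @third,  $3;
    }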
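
    If the pattern is expected to match only once, a minimal sketch is to evaluate the match in list context without `/g`, which returns the captures directly into three scalars. The sample `$string` below is made up for illustration.

    ```perl
    use strict;
    use warnings;

    # Sample input, assumed for illustration only.
    my $string = 'xxabcDEFghiJKLmnoPQRyy';

    # In list context, a match without /g returns ($1, $2, $3),
    # or an empty list when the match fails.
    my ($first, $second, $third) = $string =~ /abc(def)ghi(jkl)mno(pqr)/is;

    if (defined $first) {
        print "$first $second $third\n";
    }
    ```

    Testing `defined $first` works here because none of the groups is optional; with optional groups it would be safer to test the match itself.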
    
  • 2020-12-15 06:56

    An alternate way of doing it would look like ghostdog74's answer, but using an array that stores hash references:

    my @results;
    while( $string =~ /abc(def)ghi(jkl)mno(pqr)/igs) {
        my ($key1, $key2, $key3) = ($1, $2, $3);
        push @results, { 
            key1 => $key1,
            key2 => $key2,
            key3 => $key3,
        };
    }
    
    # do something with it
    
    foreach my $result (@results) {
        print "$result->{key1}, $result->{key2}, $result->{key3}\n";
    }
    

    The main advantages here are a single data structure and a nice, readable loop.

  • 2020-12-15 06:57

    You could have three different regexes, each focusing on a specific group. Ideally, you would assign different groups to different arrays within the regex itself, but I think your only option is to split the regex up.
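
    A minimal sketch of that split-up approach, assuming a made-up sample string: each regex captures only one group, and a `/g` match in list context returns every capture of that group.

    ```perl
    use strict;
    use warnings;

    # Sample input, assumed for illustration only.
    my $string = 'zzzabcdefghijklmnopqrsssszzzabcDEFghiJKLmnoPQRssss';

    # One capturing group per regex; the rest of the pattern is matched but not captured.
    my @first  = $string =~ /abc(def)ghijklmnopqr/gi;
    my @second = $string =~ /abcdefghi(jkl)mnopqr/gi;
    my @third  = $string =~ /abcdefghijklmno(pqr)/gi;

    print "@first | @second | @third\n";
    ```

    The trade-off is that the string is scanned three times and the pattern is repeated in three places.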

  • 2020-12-15 06:59

    @OP, when parentheses capture part of a match, you can read the captured text from the variables $1, $2, and so on (often loosely called backreferences):

    $string="zzzabcdefghijklmnopqrsssszzzabcdefghijklmnopqrssss";
    while ($string =~ /abc(def)ghi(jkl)mno(pqr)/isg) {
        print "$1 $2 $3\n";
    }
    

    output

    $ perl perl.pl
    def jkl pqr
    def jkl pqr
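
    One caveat worth a sketch: `$1`, `$2`, ... are only set by a successful match, so it is safer to test the match and copy the values right away. The input strings below are made up for illustration.

    ```perl
    use strict;
    use warnings;

    my @found;
    for my $line ('abcdefghijklmnopqr', 'no match here') {
        # Only read $1..$3 inside the branch guarded by a successful match.
        if ($line =~ /abc(def)ghi(jkl)mno(pqr)/is) {
            my ($first, $second, $third) = ($1, $2, $3);
            push @found, "$first $second $third";
        }
    }

    print "$_\n" for @found;
    ```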
    
  • 2020-12-15 07:00

    Starting with Perl 5.10, you can use named capture buffers as well:

    #!/usr/bin/perl
    
    use strict; use warnings;
    
    my %data;
    
    my $s = 'abcdefghijklmnopqr';
    
    if ($s =~ /abc (?<first>def) ghi (?<second>jkl) mno (?<third>pqr)/x ) {
        push @{ $data{$_} }, $+{$_} for keys %+;
    }
    
    use Data::Dumper;
    print Dumper \%data;
    

    Output:

    $VAR1 = {
              'first' => [
                           'def'
                         ],
              'second' => [
                            'jkl'
                          ],
              'third' => [
                           'pqr'
                         ]
            };
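
    The named-capture example uses `if`, so it records only the first match. A minimal sketch for collecting every match, assuming Perl 5.10+ and a made-up input string, swaps in a `while` loop with `/g`:

    ```perl
    use strict;
    use warnings;

    my %data;
    my $s = 'abcdefghijklmnopqr abcDEFghiJKLmnoPQR';

    # With /g in scalar context, each iteration resumes after the previous match.
    while ($s =~ /abc (?<first>def) ghi (?<second>jkl) mno (?<third>pqr)/xig) {
        push @{ $data{$_} }, $+{$_} for keys %+;
    }

    print "$_ => @{ $data{$_} }\n" for sort keys %data;
    ```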

    For earlier versions, you can use the following which avoids having to add a line for each captured buffer:

    #!/usr/bin/perl
    
    use strict; use warnings;
    
    my $s = 'abcdefghijklmnopqr';
    
    my @arrays = \ my(@first, @second, @third);
    
    if (my @captured = $s =~ /abc (def) ghi (jkl) mno (pqr) /x ) {
        push @{ $arrays[$_] }, $captured[$_] for 0 .. $#arrays;
    }
    
    use Data::Dumper;
    print Dumper @arrays;
    

    Output:

    $VAR1 = [
              'def'
            ];
    $VAR2 = [
              'jkl'
            ];
    $VAR3 = [
              'pqr'
            ];

    But I like keeping related data in a single data structure, so it is best to go back to using a hash. This does require an auxiliary array, however:

    my %data;
    my @keys = qw( first second third );
    
    if (my @captured = $s =~ /abc (def) ghi (jkl) mno (pqr) /x ) {
        push @{ $data{$keys[$_]} }, $captured[$_] for 0 .. $#keys;
    }
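
    For a single match, a hash slice is a compact alternative sketch: the key list and the capture list line up by position. The key names are the same illustrative ones used above.

    ```perl
    use strict;
    use warnings;

    my $s    = 'abcdefghijklmnopqr';
    my @keys = qw( first second third );

    my %capture;
    # The slice assigns the three captures to the three keys in order.
    # On a failed match the right side is an empty list, so every value is undef.
    @capture{@keys} = $s =~ /abc (def) ghi (jkl) mno (pqr)/x;

    print "$_ = $capture{$_}\n" for @keys;
    ```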
    

    Or, if the names of the variables really are first, second, etc., or if the names of the buffers don't matter and only their order does, you can use:

    my @data;
    if ( my @captured = $s =~ /abc (def) ghi (jkl) mno (pqr) /x ) {
        push @{ $data[$_] }, $captured[$_] for 0 .. $#captured;
    }
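
    To gather every match rather than just the first, one possible sketch runs the regex once with `/g` in list context, which returns all captures as one flat list, and then deals them out by position. The group count `3` and the sample string are assumptions tied to this particular pattern.

    ```perl
    use strict;
    use warnings;

    my $s = 'abcdefghijklmnopqr abcDEFghiJKLmnoPQR';

    # Flat list of captures, three per match, in match order.
    my @all = $s =~ /abc (def) ghi (jkl) mno (pqr)/xgi;

    my $groups = 3;    # number of capturing groups in the pattern
    my @data;
    push @{ $data[ $_ % $groups ] }, $all[$_] for 0 .. $#all;

    print "@$_\n" for @data;
    ```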
    