Why/how is an additional variable needed in matching a repeated arbitrary character with capture groups?


故里飘歌 2020-12-20 13:50

I'm matching a sequence of a repeating arbitrary character, with a minimum length, using a perl6 regex.

After reading through https://docs.perl6.org/language/regex


醉梦人生 (original poster)

2020-12-20 14:16

The reason you have to store the capture into something other than $0 is that every capturing () creates a new set of numbered captures.

So the $0 inside of ($0) can never refer to anything, because you didn't set $0 inside of the ().

(The named captures, such as $<foo>, are also affected by this.)

The following has 3 separate $0 “variables”, and one $1 “variable”:

'aabbaabb' ~~ / ^ ( (.)$0 ((.)$0) ) $0 $ /

'aabbaabb' ~~ /
                ^

                # $0 = 'aabb'
                (

                  # $0 = 'a'
                  (.) $0

                  # $1 = 'bb'
                  (

                    # $0 = 'b'
                    (.) $0
                  )
                )

                $0

                $
              /

｢aabbaabb｣
 0 => ｢aabb｣
  0 => ｢a｣
  1 => ｢bb｣
   0 => ｢b｣
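
The same nesting is visible from outside the regex when you index into the match object. The following is a minimal sketch using the regex above; the variable name $m is only for illustration. Each level of indexing steps into one capturing (), and the numbering restarts at 0 inside it.

```raku
# Keep the match object so the nested captures can be inspected afterwards.
my $m = 'aabbaabb' ~~ / ^ ( (.) $0 ( (.) $0 ) ) $0 $ /;

say ~$m[0];        # aabb  (outer $0)
say ~$m[0][0];     # a     ($0 inside the outer group)
say ~$m[0][1];     # bb    ($1 inside the outer group)
say ~$m[0][1][0];  # b     ($0 inside the innermost group)
```

Note that there is no top-level $1 here: the only capture numbered at the top level is the outer group.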

Basically the () in the regex DSL act a bit like {} in normal Perl6.

A fairly direct, if simplified, translation of the above regex to “regular” Perl6 code follows.
(Pay attention to the 3 lines with my $/ = []; each one starts a fresh set of numbered captures.)
(The comments of the form # / ^ / show which part of the regex above each piece of code corresponds to.)

given 'aabbaabb' {
    my $/ = [];      # give assignable storage for $0, $1 etc.
    my $pos = 0;     # position counter
    my $init = $pos; # initial position

    # / ^ /
    fail unless $pos == 0;

    # / ( /
    $0 = do {
        my $/ = [];
        my $init = $pos;

        # / (.) $0 /
        $0 = .substr($pos,1); # / (.) /
        $pos += $0.chars;
        fail unless .substr($pos,$0.chars) eq $0; # / $0 /
        $pos += $0.chars;

        # / ( /
        $1 = do {
            my $/ = [];
            my $init = $pos;

            # / (.) $0 /
            $0 = .substr($pos,1); # / (.) /
            $pos += $0.chars;
            fail unless .substr($pos,$0.chars) eq $0; # / $0 /
            $pos += $0.chars;

        # / ) /
            # the returned value (becomes $1 in outer scope)
           .substr($init, $pos - $init)
        }

    # / ) /
        # the returned value (becomes $0 in outer scope)
        .substr($init, $pos - $init)
    }

    # / $0 /
    fail unless .substr($pos,$0.chars) eq $0;
    $pos += $0.chars;

    # / $ /
    fail unless $pos == .chars;

    # the returned value
    .substr($init, $pos - $init)
}

TL;DR

Just remove the () surrounding ($c) / ($0).
(Assuming you didn't need the capture for something else.)

/((.) $0**2..*)/

perl6 -e '$_="bbaaaaawer"; /((.) $0**2..*)/ && put $0';
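
If the outer capture is not needed either, a variant with no enclosing () at all should behave the same way. This is only a sketch: the variable name $text is made up for illustration, and it relies on the whole match being available as $/ after a successful ~~.

```raku
my $text = 'bbaaaaawer';

# (.) is now the top-level $0, so the back-reference refers to it directly.
if $text ~~ / (.) $0 ** 2..* / {
    put ~$/;   # aaaaa  (the whole run)
    put ~$0;   # a      (the repeated character)
}
```

The leading bb is skipped because the back-reference has to match at least twice, so a run needs at least three identical characters.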
