What is the difference between using $1 vs \1 in Perl regex substitutions?

囚心锁ツ 2021-01-11 13:57

I'm debugging some code and wondered if there is any practical difference between $1 and \1 in Perl regex substitutions.

For example:

my $package_n         


        
2 answers
  • 2021-01-11 14:36

    First, you should always use warnings when developing:

    #!/usr/bin/perl
    
    use strict; use warnings;
    
    my $package_name = "Some::Package::ButNotThis";
    
    $package_name =~ s{^(\w+::\w+)}{\1};
    
    print $package_name, "\n";
    

    Output:

    \1 better written as $1 at C:\Temp\x.pl line 7.

    When you get a warning you do not understand, add diagnostics:

    C:\Temp> perl -Mdiagnostics x.pl
    \1 better written as $1 at x.pl line 7 (#1)
        (W syntax) Outside of patterns, backreferences live on as variables.
        The use of backslashes is grandfathered on the right-hand side of a
        substitution, but stylistically it's better to use the variable form
        because other Perl programmers will expect it, and it works better if
        there are more than 9 backreferences.

    Why does it work better when there are more than 9 backreferences? Here is an example:

    #!/usr/bin/perl
    
    use strict; use warnings;
    
    my $t = (my $s = '0123456789');
    my $r = join '', map { "($_)" } split //, $s;
    
    $s =~ s/^$r\z/\10/;
    $t =~ s/^$r\z/$10/;
    
    print "[$s]\n";
    print "[$t]\n";
    

    Output:

    C:\Temp> x
    ]
    [9]

    If that does not clarify it, take a look at:

    C:\Temp> x | xxd
    0000000: 5b08 5d0d 0a5b 395d 0d0a                 [.]..[9]..

    See also perlop:

    The following escape sequences are available in constructs that interpolate and in transliterations …

    \10 octal is 8 decimal. So, the replacement part contained the character code for BACKSPACE.
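
    A minimal sketch to check this, assuming a double-quoted string follows the same escape rules as the replacement part. The variable names are illustrative only. It also shows ${1}0, the usual way to write "group 1 followed by a literal 0" without it being read as group 10:

    ```perl
    use strict; use warnings;

    # "\10" in an interpolating construct is an octal escape, not group 10.
    my $octal = "\10";
    my $code  = ord $octal;          # 8, the BACKSPACE character

    # ${1}0 means group 1 followed by a literal 0; $10 would mean group 10.
    (my $padded = 'a') =~ s/^(\w)\z/${1}0/;

    print "$code\n";    # prints 8
    print "$padded\n";  # prints a0
    ```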

    NB

    Incidentally, your code does not do what you want. It will not print Some::Package, contrary to what your comment says, because all you are doing is replacing Some::Package with Some::Package without touching ::ButNotThis.

    You can either do:

    ($package_name) = $package_name =~ m{^(\w+::\w+)};
    

    or

    $package_name =~ s{^(\w+::\w+)(?:::\w+)*\z}{$1};
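
    As a quick check, here is a sketch that runs both options side by side on copies of the same string. The names $full, $by_match and $by_subst are made up for this illustration:

    ```perl
    use strict; use warnings;

    my $full = "Some::Package::ButNotThis";

    # Option 1: match in list context and keep the first captured group.
    my ($by_match) = $full =~ m{^(\w+::\w+)};

    # Option 2: match the whole string and replace it with the first group.
    (my $by_subst = $full) =~ s{^(\w+::\w+)(?:::\w+)*\z}{$1};

    print "$by_match\n";  # prints Some::Package
    print "$by_subst\n";  # prints Some::Package
    ```

    The two differ when the pattern does not match: the list assignment leaves the variable undefined, while the substitution leaves the string unchanged.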
    
  • 2021-01-11 14:45

    From perldoc perlre:

    The bracketing construct "( ... )" creates capture buffers. To refer to the current contents of a buffer later on, within the same pattern, use \1 for the first, \2 for the second, and so on. Outside the match use "$" instead of "\".

    The \<digit> notation works in certain circumstances outside the match, but it can clash with octal escapes. This happens when the backslash is followed by more than one digit.
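
    To illustrate the split between the two notations, here is a small sketch (the sample text and variable names are invented for the example) that uses \1 inside the pattern and $1 in the replacement to collapse doubled words:

    ```perl
    use strict; use warnings;

    my $text = "the the cat sat sat down";

    # \1 inside the pattern refers back to group 1 while matching;
    # $1 in the replacement is what group 1 captured.
    (my $fixed = $text) =~ s/\b(\w+) \1\b/$1/g;

    print "$fixed\n";  # prints: the cat sat down
    ```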
