I'm debugging some code and wondered if there is any practical difference between $1 and \1 in Perl regex substitutions.
For example:
my $package_n
First, you should always use warnings when developing:
#!/usr/bin/perl
use strict; use warnings;
my $package_name = "Some::Package::ButNotThis";
$package_name =~ s{^(\w+::\w+)}{\1};
print $package_name, "\n";
Output:
\1 better written as $1 at C:\Temp\x.pl line 7.
When you get a warning you do not understand, add diagnostics:
C:\Temp> perl -Mdiagnostics x.pl
\1 better written as $1 at x.pl line 7 (#1)
    (W syntax) Outside of patterns, backreferences live on as variables.
    The use of backslashes is grandfathered on the right-hand side of a
    substitution, but stylistically it's better to use the variable form
    because other Perl programmers will expect it, and it works better if
    there are more than 9 backreferences.
Why does it work better when there are more than 9 backreferences? Here is an example:
#!/usr/bin/perl
use strict; use warnings;
my $t = (my $s = '0123456789');
my $r = join '', map { "($_)" } split //, $s;
$s =~ s/^$r\z/\10/;
$t =~ s/^$r\z/$10/;
print "[$s]\n";
print "[$t]\n";
Output:
C:\Temp> x
]
[9]
If that does not clarify it, take a look at:
C:\Temp> x | xxd
0000000: 5b08 5d0d 0a5b 395d 0d0a                 [.]..[9]..
See also perlop:
The following escape sequences are available in constructs that interpolate and in transliterations …
\10 octal is 8 decimal. So, the replacement part contained the character code for BACKSPACE.
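A quick way to confirm this, as a minimal sketch (the helper name octal_escape_code is made up for illustration): in an interpolating construct, \10 is read as an octal escape, so it is the same single character as \b.

```perl
#!/usr/bin/perl
use strict; use warnings;

# In an interpolating construct, "\10" is an octal escape, not a backreference.
# octal_escape_code is a hypothetical helper used only for this illustration.
sub octal_escape_code { return ord "\10" }

print "character code: ", octal_escape_code(), "\n";   # octal 10 is decimal 8
print "same character as BACKSPACE\n" if "\10" eq "\b";
```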
Incidentally, your code does not do what you want. Contrary to what your comment says, it will not print Some::Package, because all you are doing is replacing Some::Package with Some::Package without touching ::ButNotThis.
You can either do:
($package_name) = $package_name =~ m{^(\w+::\w+)};
or
$package_name =~ s{^(\w+::\w+)(?:::\w+)*\z}{$1};
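To see the difference side by side, here is a small sketch that runs the original substitution and both alternatives on the sample string from the question. The helper names (noop_subst, via_match, via_subst) are made up for illustration.

```perl
#!/usr/bin/perl
use strict; use warnings;

# Original form: replaces the matched prefix with itself, so nothing changes.
sub noop_subst { my ($name) = @_; $name =~ s{^(\w+::\w+)}{$1}; return $name }

# List-context match: keep only the captured prefix.
sub via_match  { my ($name) = @_; ($name) = $name =~ m{^(\w+::\w+)}; return $name }

# Substitution that also consumes the remaining ::Component parts.
sub via_subst  { my ($name) = @_; $name =~ s{^(\w+::\w+)(?:::\w+)*\z}{$1}; return $name }

my $package_name = "Some::Package::ButNotThis";
print noop_subst($package_name), "\n";   # Some::Package::ButNotThis
print via_match($package_name),  "\n";   # Some::Package
print via_subst($package_name),  "\n";   # Some::Package
```

Note that via_match leaves the name undefined when the pattern does not match at all, so you may want to check for that case in real code.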
From perldoc perlre:
The bracketing construct "( ... )" creates capture buffers. To refer to the current contents of a buffer later on, within the same pattern, use \1 for the first, \2 for the second, and so on. Outside the match use "$" instead of "\".
The \<digit> notation works in certain circumstances outside the match. But it can potentially clash with octal escapes. This happens when the backslash is followed by more than 1 digits.
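A related case, sketched here under my own assumptions (the sample input and the helper name append_zero are purely illustrative): if you want the first capture followed by a literal digit in the replacement, $10 would be read as the tenth capture. Writing ${1}0 keeps the variable name unambiguous.

```perl
#!/usr/bin/perl
use strict; use warnings;

# Append a literal "0" to the first capture; the braces in ${1} delimit
# the variable name so it is not read as $10.
# append_zero is a hypothetical helper used only for this illustration.
sub append_zero { my ($s) = @_; $s =~ s/^(\d+)\z/${1}0/; return $s }

print append_zero("42"), "\n";   # 420
```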