Perl regex substitution using external parameters

独厮守ぢ 2021-01-05 15:18

Consider the following example:

my $text = "some_strange_thing";
$text =~ s/some_(\w+)_thing/no_$1_stuff/;
print "Result: $text\n";

2 Answers
  • 2021-01-05 15:26

    Essentially the same approach as the accepted solution, but I kept the initial lines the same as the problem statement, since I thought that might make it easier to fit into more situations:

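    my $match = "some_(\\w+)_thing";
    my $repl = "no_\$1_stuff";
    
    # Compile the pattern once.
    my $qrmatch = qr($match);
    my $code = $repl;
    
    # Backslash-escape every double quote and backslash, then wrap the result
    # in double quotes so it becomes the source of a Perl string literal.
    $code =~ s/([^"\\]*)(["\\])/$1\\$2/g;
    $code = qq["$code"];
    
    if (!defined($code)) {
      die "Couldn't find appropriate quote marks";
    }
    
    my $text = "some_strange_thing";
    $text =~ s/$qrmatch/$code/ee;  # /ee evaluates $code, then evals that string
    print "Result: $text\n";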
    

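    Note that this copes with double quotes and backslashes anywhere in $repl, whereas the naive solution has issues if $repl contains a double quote character itself, or ends in a backslash. Other sigils that Perl interpolates in double-quoted strings, such as @, are still interpreted.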
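
    To see the escaping step on those two problem cases, here is a minimal sketch using only core Perl. The helper names quote_template and apply_template and the sample templates are made up for illustration; the escaping regex is the one from the code above.

    ```perl
    use strict;
    use warnings;

    # Hypothetical helper (not from the answer): turn a replacement template
    # into the source text of a double-quoted Perl string.
    sub quote_template {
        my ($repl) = @_;
        my $code = $repl;
        $code =~ s/([^"\\]*)(["\\])/$1\\$2/g;   # backslash-escape " and \
        return qq["$code"];
    }

    # Hypothetical helper: apply one template to one input string.
    sub apply_template {
        my ($text, $qrmatch, $repl) = @_;
        my $code = quote_template($repl);
        $text =~ s/$qrmatch/$code/ee;
        return $text;
    }

    my $qrmatch = qr/some_(\w+)_thing/;

    # A template that contains double quotes.
    my $quoted = apply_template("some_strange_thing", $qrmatch, 'say "$1"!');

    # A template that ends in a single backslash.
    my $slashed = apply_template("some_strange_thing", $qrmatch, 'no_$1\\');

    print "$quoted\n";    # say "strange"!
    print "$slashed\n";   # no_strange\
    ```

    Without the escaping line, both templates would produce invalid Perl source in the second evaluation pass.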

    Also, assuming that you're going to run the three lines at the end (or something like it) in a loop, do make sure that you don't skip the qr line. It will make a huge performance difference if you skip the qr and just use s/$match/$code/ee.
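
    As a rough sketch of that loop layout (the @lines sample data is made up, and nothing here measures the speed difference), the pattern and the quoted replacement are prepared once and only the substitution runs per item:

    ```perl
    use strict;
    use warnings;

    my $match = 'some_(\w+)_thing';
    my $repl  = 'no_$1_stuff';

    # Done once, before the loop: compile the pattern and quote the template.
    my $qrmatch = qr/$match/;
    (my $code = $repl) =~ s/([^"\\]*)(["\\])/$1\\$2/g;
    $code = qq["$code"];

    # Made-up sample input.
    my @lines = ("some_strange_thing", "some_odd_thing", "nothing to change");

    my @results;
    for my $line (@lines) {
        my $text = $line;
        $text =~ s/$qrmatch/$code/ee;   # only the substitution runs per item
        push @results, $text;
    }

    print "Result: $_\n" for @results;
    ```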

    Also, even though it's not as trivial to get arbitrary code execution with this solution as it is with the accepted one, it wouldn't surprise me if it's still possible. In general, I'd avoid solutions based on s///ee if the $match or $repl come from untrusted users. (e.g., don't build a web service out of this)

    Doing this kind of replacement securely when $match and $repl are supplied by untrusted users should be asked as a different question if your use case includes that.

  • 2021-01-05 15:46

    Solution 1: String::Substitution

    Use the String::Substitution package:

    use String::Substitution qw(gsub_modify);
    
    my $find = 'some_(\w+)_thing';
    my $repl = 'no_$1_stuff';
    my $text = "some_strange_thing";
    gsub_modify($text, $find, $repl);
    print $text,"\n";
    

    The replacement string only interpolates (term used loosely) numbered match vars (like $1 or ${12}). See "interpolate_match_vars" for more information.
    This module does not save or interpolate $& to avoid the "considerable performance penalty" (see perlvar).
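
    The idea of interpolating only numbered match variables can be sketched in core Perl without any eval. This is not how String::Substitution is implemented; expand_template and gsub_template are hypothetical names, and turning an unknown or unmatched group into an empty string is an assumption made for this sketch.

    ```perl
    use strict;
    use warnings;

    # Hypothetical helper: replace $1, $2, ${12}, ... in a template with the
    # corresponding captured strings. Nothing is eval'ed.
    sub expand_template {
        my ($template, @captures) = @_;
        my $lookup = sub {
            my ($n) = @_;
            # Assumption: unknown or unmatched groups become an empty string.
            return ($n >= 1 && defined $captures[$n - 1]) ? $captures[$n - 1] : '';
        };
        $template =~ s/\$(?:\{(\d+)\}|(\d+))/$lookup->(defined $1 ? $1 : $2)/ge;
        return $template;
    }

    # Hypothetical helper: global substitution driven by a template string.
    sub gsub_template {
        my ($text, $find, $repl) = @_;
        no strict 'refs';   # for the symbolic lookup of $1, $2, ... below
        $text =~ s/$find/expand_template($repl, map { $$_ } 1 .. $#+)/ge;
        return $text;
    }

    print gsub_template('some_strange_thing', 'some_(\w+)_thing', 'no_$1_stuff'), "\n";
    print gsub_template('a-b c-d', qr/(\w)-(\w)/, '${2}-$1'), "\n";
    ```

    Because the template is only ever treated as text, a replacement such as '$1"; die "oops' ends up in the output literally instead of being executed.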

    Solution 2: Data::Munge

    This is a solution mentioned by Grinnz in the comments below.

    Data::Munge can be used as follows:

    use Data::Munge;
    
    my $find = qr/some_(\w+)_thing/;
    my $repl = 'no_$1_stuff';
    my $text = 'some_strange_thing';
    my $flags = 'g';
    print replace($text, $find, $repl, $flags);
    # => no_strange_stuff
    

    Solution 3: A quick'n'dirty way (if replacement won't contain double quotes and security is not considered)

    DISCLAIMER: I provide this solution as this approach can be found online, but its caveats are not explained. Do not use it in production.

    With this approach, the replacement string cannot contain a double quotation mark ("). It is also equivalent to giving whoever writes the configuration file direct code access, so it should not be exposed to Web users (as mentioned by Daniel Martin).

    You can use the following code:

    #!/usr/bin/perl
    my $match = qr"some_(\w+)_thing";
    my $repl = '"no_$1_stuff"';
    my $text = "some_strange_thing";
    $text =~ s/$match/$repl/ee;
    print "Result: $text\n";
    


    Result:

    Result: no_strange_stuff
    

    You have to:

    1. Declare the replacement as '"..."' so that $1 can be evaluated later.
    2. Use /ee to force double evaluation of the replacement, so that the variables in it are interpolated.

    A modifier available specifically to search and replace is the s///e evaluation modifier. s///e treats the replacement text as Perl code, rather than a double-quoted string. The value that the code returns is substituted for the matched substring. s///e is useful if you need to do a bit of computation in the process of replacing text.
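
    A small sketch of the difference between one and two evaluation passes, using core Perl only. The sample strings and variable names are made up for illustration.

    ```perl
    use strict;
    use warnings;

    # /e: the replacement is an expression and its value is substituted.
    my $counts = "3 apples and 4 pears";
    $counts =~ s/(\d+)/$1 * 2/ge;

    my $repl = '"no_$1_stuff"';   # Perl source for a double-quoted string

    # One /e: the expression $repl is evaluated once, giving its contents verbatim.
    my $once = "some_strange_thing";
    $once =~ s/some_(\w+)_thing/$repl/e;

    # Two /e: that result is then evaluated again as Perl code.
    my $twice = "some_strange_thing";
    $twice =~ s/some_(\w+)_thing/$repl/ee;

    # The second pass runs whatever Perl the string contains, so a
    # replacement can call functions. This is why it must come from a trusted source.
    my $code  = 'uc($1) . "!"';
    my $shout = "some_strange_thing";
    $shout =~ s/some_(\w+)_thing/$code/ee;

    print "$counts\n";   # 6 apples and 8 pears
    print "$once\n";     # "no_$1_stuff"
    print "$twice\n";    # no_strange_stuff
    print "$shout\n";    # STRANGE!
    ```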

    You can use qr to precompile the regex pattern (qr"some_(\w+)_thing").
