Passing a regex substitution as a variable in Perl?

后端 未结 9 1829
青春惊慌失措
青春惊慌失措 2020-12-13 06:45

I need to pass a regex substitution as a variable:

sub proc {
    my $pattern = shift;
    my $txt = \"foo baz\";

    $txt =~ $pattern;
}

my $pattern = \'s         


        
相关标签:
9条回答
  • 2020-12-13 07:23
    sub proc {
        my($match, $subst) = @_;
        my $txt = "foo baz";
        $txt =~ s/$match/$subst/;
        print "$txt\n";
    }
    
    my $matcher = qr/foo/;
    my $sub_str = "bar";
    
    proc($matcher, $sub_str);
    

    This rather directly answers your question. You can do more - but when I used a qr// term instead of the $sub_str as a simple literal, then the expanded regex was substituted.

    I recently needed to create a parser (test parser) for statements with some peculiar (dialect of) SQL types, recognizing lines such as this, splitting it into three type names:

    input: datetime year to second,decimal(16,6), integer
    

    The script I used to demo this used quoted regexes.

    #!/bin/perl -w
    use strict;
    while (<>)
    {
        chomp;
        print "Read: <$_>\n";
        my($r1) = qr%^input\s*:\s*%i;
        if ($_ =~ $r1)
        {
            print "Found input:\n";
            s%$r1%%;
            print "Residue: <$_>\n";
            my($r3) = qr%(?:year|month|day|hour|minute|second|fraction(?:\([1-5]\))?)%;
            my($r2) = qr%
                            (?:\s*,?\s*)?   # Commas and spaces
                            (
                                (?:money|numeric|decimal)(?:\(\d+(?:,\d+)?\))?   |
                                int(?:eger)?  |
                                smallint      |
                                datetime\s+$r3\s+to\s+$r3
                            )
                        %ix;
            while ($_ =~ m/$r2/)
            {
                print "Got type: <$1>\n";
                s/$r2//;
            }
            print "Residue 2: <$_>\n";
        }
        else
        {
            print "No match:\n";
        }
        print "Next?\n";
    }
    

    We can argue about the use of names like $r1, etc. But it did the job...it was not, and is not, production code.

    0 讨论(0)
  • 2020-12-13 07:24

    Perhaps you might re-think your approach.

    You want to pass in to a function a regex substitution, probably because the function will be deriving the text to be operated upon from some other source (reading from a file, socket, etc.). But you're conflating regular expression with regular expression substitution.

    In the expression, s/foo/bar/, you actually have a regular expression ("/foo/") and a substitution ("bar") that should replace what is matched by the expression. In the approaches you've tried thus far, you ran into problems trying to use eval, mainly because of the likelihood of special characters in the expression that either interfere with eval or get interpolated (i.e. gobbled up) in the process of evaluation.

    So instead, try passing your routine two arguments: the expression and the substitution:

    sub apply_regex {
        my $regex = shift;
        my $subst = shift || ''; # No subst string will mean matches are "deleted"
    
        # some setup and processing happens...
    
        # time to make use of the regex that was passed in:
        while (defined($_ = <$some_filehandle>)) {
            s/$regex/$subst/g; # You can decide if you want to use /g etc.
        }
    
        # rest of processing...
    }
    

    This approach has an added benefit: if your regex pattern doesn't have any special characters in it, you can just pass it in directly:

    apply_regex('foo', 'bar');
    

    Or, if it does, you can use the qr// quoting-operator to create a regex object and pass that as the first parameter:

    apply_regex(qr{(foo|bar)}, 'baz');
    apply_regex(qr/[ab]+/, '(one or more of "a" or "b")');
    apply_regex(qr|\d+|); # Delete any sequences of digits
    

    Most of all, you really don't need eval or the use of code-references/closures for this task. That will only add complexity that may make debugging harder than it needs to be.

    Randy

    0 讨论(0)
  • 2020-12-13 07:26

    s/// is not a regex. Thus, you can't pass it as a regex.

    I don't like eval for this, it's very fragile, with a lot of bordercases.

    I think it's best to take an approach similar to the one Javascript takes: pass both a regex (in Perl, that is qr//) and a code reference for the substitution. For example, to pass parameters to get the same effect as

    s/(\w+)/\u\L$1/g;
    

    You can call

    replace($string, qr/(\w+)/, sub { "\u\L$1" }, 'g');
    

    Note that the 'g' modifier is not actually a flag for the regex (I think attaching it to the regex is a design mistake in Javascript), so I chose to pass it in a 3rd parameter.

    Once the API has been decided on, implementation can be done next:

    sub replace {
        my($string, $find, $replace, $global) = @_;
        unless($global) {
            $string =~ s($find){ $replace->() }e;
        } else {
            $string =~ s($find){ $replace->() }ge;
        }
        return $string;
    }
    

    Let's try it:

    print replace('content-TYPE', qr/(\w+)/, sub { "\u\L$1" }, 'g');
    

    Result:

    Content-Type

    That looks good to me.

    0 讨论(0)
  • 2020-12-13 07:28

    I need to pass a regex substitution as a variable

    Do you? Why not pass a code reference? Example:

    sub modify
    {
      my($text, $code) = @_;
      $code->($text);
      return $text;
    }
    
    my $new_text = modify('foo baz', sub { $_[0] =~ s/foo/bar/ });
    

    In general, when you want to pass "something that does something" to a subroutine ("a regex substitution" in the case of your question) the answer is to pass a reference to a piece of code. Higher Order Perl is a good book on the topic.

    0 讨论(0)
  • 2020-12-13 07:30

    Well, you can precompile the RE using the qr// operator. But you can't pass an operator (s///).

    $pattern = qr/foo/;
    
    print "match!\n" if $text =~ $pattern;
    

    But if you have to pass the substitution operator, you are down to passing either code or strings:

    proc('$text =~ s/foo/bar');
    
    sub proc { 
       my $code = shift;
    
       ...
    
       eval $code;
    }
    

    or, code:

    proc(sub {my $text = shift;  $text =~ s/foo/bar});
    
    sub proc {
       my $code = shift;
    
       ...
    
       $code->("some text");
    }
    
    0 讨论(0)
  • 2020-12-13 07:34

    eval "$txt =~ $pattern";
    This becomes
    eval "\"foo baz\" =~ s/foo/bar/"
    and substitutions don't work on literal strings.

    This would work:

    eval "\$txt =~ $pattern"
    but that's not very pleasing. eval is almost never the right solution.

    zigdon's solution can do anything, and Jonathan's solution is quite suitable if the replacement string is static. If you want something more structured than the first and more flexible than the second, I'd suggest a hybrid:

    sub proc {
        my $pattern = shift;
        my $code = shift;
        my $txt = "foo baz";
        $txt =~ s/$pattern/$code->()/e;
        print "$txt\n";
    }
    my $pattern = qr/foo/;
    proc($pattern, sub { "bar" });   # ==> bar baz
    proc($pattern, sub { "\U$&" });  # ==> FOO baz

    0 讨论(0)
提交回复
热议问题