How do I use a variable as a regex modifier in Perl?

Posted by 纵饮孤独 on 2019-11-27 19:29:54

Question


I'm writing an abstraction function that will ask the user a given question and validate the answer based on a given regular expression. The question is repeated until the answer matches the validation regexp.

However, I also want the client to be able to specify whether the answer must match case-sensitively or not.

So something like this:

sub ask {
    my ($prompt, $validationRe, $caseSensitive) = @_;
    my $modifier = ($caseSensitive) ? "" : "i";
    my $ans;
    my $isValid;

    do {
        print $prompt;
        $ans = <>;
        chomp($ans);

        # What I want to do that doesn't work:
        # $isValid = $ans =~ /$validationRe/$modifier;

        # What I have to do:
        $isValid = ($caseSensitive) ?
            ($ans =~ /$validationRe/) :
            ($ans =~ /$validationRe/i);

    } while (!$isValid);

    return $ans;
}

Upshot: is there a way to dynamically specify a regular expression's modifiers?


Answer 1:


Upshot: is there a way to dynamically specify a regular expression's modifiers?

From perldoc perlre:

"(?adlupimsx-imsx)" "(?^alupimsx)" One or more embedded pattern-match modifiers, to be turned on (or turned off, if preceded by "-") for the remainder of the pattern or the remainder of the enclosing pattern group (if any).

This is particularly useful for dynamic patterns, such as those read in from a configuration file, taken from an argument, or specified in a table somewhere. Consider the case where some patterns want to be case-sensitive and some do not: The case-insensitive ones merely need to include "(?i)" at the front of the pattern.

Which gives you something along the lines of

$isValid = $ans =~ m/(?$modifier)$validationRe/;

Just be sure to take the appropriate security precautions when accepting user input in this way.
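As a minimal sketch of how this could slot into the question's validation step (the helper name is_valid is hypothetical, and $validationRe is assumed to be a plain pattern string), the inline modifier can be prepended only when case-insensitive matching is wanted:

```perl
use strict;
use warnings;

# Hypothetical helper, not part of the original question:
# returns 1 if $ans matches $validationRe, otherwise 0.
sub is_valid {
    my ($ans, $validationRe, $caseSensitive) = @_;
    # Prepend the inline modifier only for case-insensitive matching.
    my $prefix = $caseSensitive ? '' : '(?i)';
    return $ans =~ /$prefix$validationRe/ ? 1 : 0;
}

print is_valid('YES', '^(yes|no)$', 0), "\n";   # prints 1 (case-insensitive)
print is_valid('YES', '^(yes|no)$', 1), "\n";   # prints 0 (case-sensitive)
```

Building the prefix conditionally keeps the case-sensitive pattern identical to what the caller supplied.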




Answer 2:


You might also like the qr operator which quotes its STRING as a regular expression.

my $rex = qr/(?$mod)$pattern/;
$isValid = <STDIN> =~ $rex;
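One possible way to use this in a prompt loop is to compile the pattern once and reuse it for every attempt. The sketch below assumes a hypothetical make_validator helper that returns a closure; neither the name nor the closure design comes from the answer itself:

```perl
use strict;
use warnings;

# Hypothetical helper: compile the pattern once with qr// and
# return a closure that checks an answer against it.
sub make_validator {
    my ($pattern, $caseSensitive) = @_;
    my $mod = $caseSensitive ? '' : 'i';
    # Only embed the (?...) group when there is a modifier to apply.
    my $rex = $mod ? qr/(?$mod)$pattern/ : qr/$pattern/;
    return sub {
        my ($ans) = @_;
        return $ans =~ $rex ? 1 : 0;
    };
}

my $check = make_validator('^y(es)?$', 0);
print $check->('Yes'), "\n";    # prints 1
print $check->('nope'), "\n";   # prints 0
```

Because the compiled regex lives in the closure, repeated prompts do not recompile the pattern.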



Answer 3:


Get rid of your $caseSensitive parameter, as it will be useless in many cases. Instead, users of that function can encode the necessary information directly in the $validationRe regex.

When you create a regex object like qr/foo/, the pattern is compiled at that point into instructions for the regex engine. If you stringify a regex object, you get a string that, when interpolated back into a regex, has exactly the same behaviour as the original regex object. Most importantly, this means that all flags provided or omitted in the regex object literal are preserved and can't be overridden! This is by design, so that a regex object continues to behave identically no matter what context it is used in.

That's a bit dry, so let's use an example. Here is a match function that tries to apply a couple of similar regexes to a list of strings. Which ones will match?

use strict;
use warnings;
use feature 'say';

# This sub takes a string to match on, a regex, and a case insensitive marker.
# The regex will be recompiled to anchor at the start and end of the string.
sub match {
    my ($str, $re, $i) = @_;
    return $str =~ /\A$re\z/i if $i;
    return $str =~ /\A$re\z/;
}

my @words = qw/foo FOO foO/;
my $real_regex = qr/foo/;
my $fake_regex = 'foo';

for my $re ($fake_regex, $real_regex) {
    for my $i (0, 1) {
        for my $word (@words) {
            my $match = 0+ match($word, $re, $i);
            my $output = qq("$word" =~ /$re/);
            $output .= "i" if $i;
            say "$output\t-->" . uc($match ? "match" : "fail");
        }
    }
}

Output:

"foo" =~ /foo/  -->MATCH
"FOO" =~ /foo/  -->FAIL
"foO" =~ /foo/  -->FAIL
"foo" =~ /foo/i -->MATCH
"FOO" =~ /foo/i -->MATCH
"foO" =~ /foo/i -->MATCH
"foo" =~ /(?^:foo)/     -->MATCH
"FOO" =~ /(?^:foo)/     -->FAIL
"foO" =~ /(?^:foo)/     -->FAIL
"foo" =~ /(?^:foo)/i    -->MATCH
"FOO" =~ /(?^:foo)/i    -->FAIL
"foO" =~ /(?^:foo)/i    -->FAIL

First, we should notice that the string representation of regex objects has this weird (?^:...) form. In a non-capturing group (?: ... ), modifiers for the pattern inside the group can be added or removed between the question mark and colon, while the ^ indicates the default set of flags.

Now when we look at the fake regex that's actually just a string being interpolated, we can see that the addition of the /i flag makes a difference as expected. But when we use a real regex object, it doesn't change anything: The outside /i cannot override the (?^: ... ) flags.
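The same effect can be seen in a smaller sketch. It assumes Perl 5.14 or later, where regex objects stringify in the (?^...) form; the variable names are illustrative only:

```perl
use strict;
use warnings;

my $plain  = qr/foo/;
my $nocase = qr/foo/i;

# Stringified regex objects carry their own flags.
print "$plain\n";     # prints (?^:foo)
print "$nocase\n";    # prints (?^i:foo)

# A flag stored in the object survives interpolation into a larger pattern...
my $inner_i = ("FOO" =~ /\A$nocase\z/) ? "match" : "fail";
# ...while an outer /i cannot reach into an object compiled without it.
my $outer_i = ("FOO" =~ /\A$plain\z/i) ? "match" : "fail";

print "$inner_i\n";   # prints match
print "$outer_i\n";   # prints fail
```

So a caller who wants case-insensitive validation can simply pass qr/foo/i, and the function does not need a separate flag parameter.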

It is probably best to assume that all regexes are already regex objects and should not be interfered with. If you load the regex patterns from a file, you should require the regexes to use the (?: ... ) syntax to apply flags (e.g. (?^i:foo) as an equivalent to qr/foo/i). For example, loading one regex per line from a file handle could look like:

my @regexes;
while (<$fh>) {
    chomp;
    push @regexes, qr/$_/;  # will die here on regex syntax errors
}
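If dying on the first bad pattern is not wanted, one option is to wrap the compilation in a block eval and collect the failures. This is only a sketch: the in-memory file handle and the sample patterns are made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical input: one pattern per line, flags given inline.
my $patterns = "(?i:foo)\nba[rz]\n(unclosed\n";
open my $fh, '<', \$patterns or die "Cannot open in-memory handle: $!";

my (@regexes, @rejected);
while (my $line = <$fh>) {
    chomp $line;
    # qr// dies on a syntax error; the block eval turns that into undef.
    my $re = eval { qr/$line/ };
    if (defined $re) {
        push @regexes, $re;
    }
    else {
        push @rejected, $line;
    }
}
close $fh;

print scalar(@regexes), " compiled, ", scalar(@rejected), " rejected\n";
# prints "2 compiled, 1 rejected"
```

Catching the error this way lets the caller report every invalid line at once instead of stopping at the first one.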



Answer 4:


Another option is the string form of eval, which builds the match expression at run time:

sub ask {
    my ($prompt, $validationRe, $caseSensitive) = @_;
    my $modifier = ($caseSensitive) ? "" : "i";
    my $ans;
    my $isValid;

    do {
        print $prompt;
        $ans = <>;
        chomp($ans);

        # What I want to do that doesn't work:
        # $isValid = $ans =~ /$validationRe/$modifier;

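        # Escape $ans and $validationRe so that only $modifier is
        # interpolated into the code string; an unescaped $ans would
        # run the user's input as Perl code.
        $isValid = eval "\$ans =~ /\$validationRe/$modifier";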

        # What I have to do:
        #$isValid = ($caseSensitive) ?
        #    ($ans =~ /$validationRe/) :
        #    ($ans =~ /$validationRe/i);

    } while (!$isValid);

    return $ans;
}


Source: https://stackoverflow.com/questions/27576498/how-do-i-use-a-variable-as-a-regex-modifier-in-perl
