Embedding evaluations in Perl regex

猫巷女王i 2021-01-05 00:05

So I'm writing a quick Perl script that cleans up some HTML code and runs it through an HTML -> PDF program. I want to lose as little information as possible, so I'd like t

6 Answers
  • 2021-01-05 00:37

    I believe your problem is an unescaped /

    If it's not the problem, it certainly is a problem.

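    Try this instead; note the \/80. This fixes only the delimiter problem: (?{ ... }) is still not evaluated in the replacement text, so the /e modifier is needed for that.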

    $file=~s/<textarea rows="(.+?)"(.*?)>(.*?)<\/textarea>/<textarea rows="(?{ length($3)\/80 })"$2>$3<\/textarea>/gis;
    

    The basic pattern for this code is:

    $file =~ s/some_search/some_replace/gis;
    

    The trailing gis are modifiers: g = global (replace every match), i = case-insensitive, s = single-line mode, which lets . match newlines as well.
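
    A minimal sketch of what each modifier changes, using made-up sample strings (the variable names are only illustrative):

    ```perl
    use strict;
    use warnings;

    my $text = "Foo\nfoo\nFOO";

    # No modifiers: only the first case-sensitive match is replaced.
    (my $plain = $text) =~ s/foo/bar/;

    # /g replaces every match, /i ignores case.
    (my $global = $text) =~ s/foo/bar/gi;

    # Without /s, "." does not match a newline; with /s it does.
    my $dot_without_s = ("a\nb" =~ /a.b/)  ? 1 : 0;
    my $dot_with_s    = ("a\nb" =~ /a.b/s) ? 1 : 0;

    print "$plain\n---\n$global\n---\n";
    print "without /s: $dot_without_s, with /s: $dot_with_s\n";
    ```

    In the textarea substitution, /s matters because the text between the tags can span several lines.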

  • 2021-01-05 00:39

    First, you need to escape the / inside the expression in the replacement text (otherwise Perl will see an s/// operator followed by the number 80 and so on). Or you can use a different delimiter; for complex substitutions, matching brackets are a good idea.

    Then you get to the main problem, which is that (?{...}) is only available in patterns. The replacement text is not a pattern, it's (almost) an ordinary string.

    Instead, there is the e modifier to the s/// operator, which lets you write a replacement expression rather than a replacement string.

    $file =~ s(<textarea rows="(.+?)"(.*?)>(.*?)</textarea>)
              ("<textarea rows=\"" . (length($3)/80) . "\"$2>$3</textarea>")egis;
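
    To see the difference side by side, here is a small sketch on a made-up input string (the sample HTML and variable names are assumptions for illustration only):

    ```perl
    use strict;
    use warnings;

    my $html = '<textarea rows="5" name="t">' . ('x' x 200) . '</textarea>';

    # Without /e the replacement is an interpolated string, so the
    # (?{ ... }) text is copied into the output instead of being run.
    (my $literal = $html) =~ s{<textarea rows="(.+?)"(.*?)>(.*?)</textarea>}
                              {<textarea rows="(?{ length($3)/80 })"$2>$3</textarea>}gis;

    # With /e the replacement is a Perl expression whose value is used.
    (my $evaluated = $html) =~ s{<textarea rows="(.+?)"(.*?)>(.*?)</textarea>}
                                {'<textarea rows="' . (length($3)/80) . qq{"$2>$3</textarea>}}gise;

    print substr($literal, 0, 40), "\n";
    print substr($evaluated, 0, 40), "\n";
    ```

    Note that 200/80 yields a fractional value here, so you may want to round it before writing it into the rows attribute.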
    
  • 2021-01-05 00:43

    The syntax you are using to embed code is only valid in the "match" portion of the substitution (the left-hand side). To embed code in the right-hand side (which is a normal Perl double-quoted string), you can do this:

    $file =~ s{<textarea rows="(.+?)"(.*?)>(.*?)</textarea>}
              {<textarea rows="@{[ length($3)/80 ]}"$2>$3</textarea>}gis;
    

    This uses the Perl idiom of "some string @{[ embedded_perl_code() ]} more string".
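
    Outside of a substitution, the same idiom can be sketched like this (the @items data is made up for illustration):

    ```perl
    use strict;
    use warnings;

    my @items = (3, 4, 5);

    # @{[ ... ]} builds an anonymous array from the expression and
    # dereferences it at once, so the result is interpolated.
    my $count = "count=@{[ scalar @items ]}";

    # The expression runs in list context; several elements are
    # joined with a space (the default value of $").
    my $doubled = "doubled=@{[ map { $_ * 2 } @items ]}";

    print "$count\n$doubled\n";
    ```

    Because the expression is evaluated in list context, wrap it in scalar(...) when you need a scalar result such as a count.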

    But if you are working with a very complex statement, it may be easier to put the substitution into "eval" mode, where it treats the replacement string as Perl code:

    $file =~ s{<textarea rows="(.+?)"(.*?)>(.*?)</textarea>}
              {'<textarea rows="' . (length($3)/80) . qq{"$2>$3</textarea>}}gise;
    

    Note that in both examples the substitution is written as s{}{}. This not only eliminates the need to escape the slashes, but also allows you to spread the expression over multiple lines for readability.

  • 2021-01-05 00:48

    The (?{...}) pattern is an experimental feature for executing code on the match side, but you want to execute code on the replacement side. Use the /e regular-expression switch for that:

    #! /usr/bin/perl
    
    use warnings;
    use strict;
    
    use POSIX qw/ ceil /;
    
    while (<DATA>) {
      s[<textarea rows="(.+?)"(.*?)>(.*?)</textarea>] {
        my $rows = ceil(length($3) / 80);
        qq[<textarea rows="$rows"$2>$3</textarea>];
      }egis;
      print;
    }
    
    __DATA__
    <textarea rows="123" bar="baz">howdy</textarea>
    

    Output:

    <textarea rows="1" bar="baz">howdy</textarea>
  • 2021-01-05 00:54

    Must this be done with regex? Parsing any markup language (or even CSV) with regex is fraught with error. If you can, try to utilize a standard library:

    http://search.cpan.org/dist/HTML-Parser/Parser.pm

    Otherwise you risk the revenge of Cthulhu:

    http://www.codinghorror.com/blog/2009/11/parsing-html-the-cthulhu-way.html

    (Yes, the article leaves room for some simple string manipulation, so I think your soul is safe. :-)

  • 2021-01-05 00:58

    As per http://perldoc.perl.org/perlrequick.html#Search-and-replace, this can be accomplished with the "evaluation modifier s///e", i.e., your gis must have an extra e in it.

    The evaluation modifier s///e wraps an eval{...} around the replacement string and the evaluated result is substituted for the matched substring. Some examples:

    # convert percentage to decimal
    $x = "A 39% hit rate";
    $x =~ s!(\d+)%!$1/100!e;       # $x contains "A 0.39 hit rate"
    