Why are function calls in Perl loops so slow?

无人共我 2021-02-02 15:18

I was writing a file parser in Perl, so I had to loop through the file. The file consists of fixed-length records, and I wanted to make a separate function that parses a given record and cal

4 answers
  • 2021-02-02 15:20

    If your sub has no arguments and is a constant as in your example, you can get a major speed-up by using an empty prototype "()" in the sub declaration:

    sub get_string() {
        return sprintf("%s\n", 'abc');
    }
    

    However, this is probably a special case of your example that does not match your real case. It mainly goes to show the dangers of benchmarks.

    You'll learn this tip and many others by reading perlsub.
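    One way to check whether a call was actually inlined is to deparse the calling code. The following is a minimal sketch, assuming the core B::Deparse module; the sub names plain_answer and inlined_answer are made up for illustration and are not from the question.

```perl
use strict;
use warnings;
use B::Deparse;

# Hypothetical subs for illustration only.
sub plain_answer     { 42 }   # ordinary sub: stays a real call
sub inlined_answer() { 42 }   # empty prototype + constant body: inlining candidate

# These callers are compiled after the subs above, so the compiler
# already knows about the prototype when it sees the calls.
my $uses_plain   = sub { my $x = plain_answer() };
my $uses_inlined = sub { my $x = inlined_answer() };

my $deparse      = B::Deparse->new;
my $plain_text   = $deparse->coderef2text($uses_plain);
my $inlined_text = $deparse->coderef2text($uses_inlined);

print "plain:   $plain_text\n";
print "inlined: $inlined_text\n";
```

    If the sub was inlined, the deparsed caller should show the literal value in place of the sub name.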

    Here is a benchmark:

    use strict;
    use warnings;
    use Benchmark qw(cmpthese);
    
    sub just_return { return }
    sub get_string  { sprintf "%s\n", 'abc' }
    sub get_string_with_proto()  { sprintf "%s\n", 'abc' }
    
    my %methods = (
        direct      => sub { my $s = sprintf "%s\n", 'abc' },
        function    => sub { my $s = get_string()          },
        just_return => sub { my $s = just_return()         },
        function_with_proto => sub { my $s = get_string_with_proto() },
    );
    
    cmpthese(-2, \%methods);
    

    and its result:

                              Rate function just_return   direct function_with_proto
    function             1488987/s       --        -65%     -90%                -90%
    just_return          4285454/s     188%          --     -70%                -71%
    direct              14210565/s     854%        232%       --                 -5%
    function_with_proto 15018312/s     909%        250%       6%                  --
    
  • 2021-02-02 15:33

    Perl function calls are slow. It sucks, because the very thing you want to be doing, decomposing your code into maintainable functions, is the very thing that will slow your program down.

    Why are they slow? Perl does a lot of things when it enters a subroutine, a result of it being extremely dynamic (i.e. you can mess with a lot of things at run time). To name a few: it has to get the code reference for that name, check that it is a code ref, set up a new lexical scratchpad (to store "my" variables), set up a new dynamic scope (to store "local" variables), set up @_, check what context it was called in, and pass along the return value. Attempts have been made to optimize this process, but they haven't paid off. See pp_entersub in pp_hot.c for the gory details.

    Also there was a bug in 5.10.0 slowing down functions. If you're using 5.10.0, upgrade.

    As a result, avoid calling functions over and over again in a long loop, especially if it's nested. Can you cache the results, perhaps using Memoize? Does the work have to be done inside the loop? Does it have to be done inside the innermost loop? For example:

    for my $thing (@things) {
        for my $person (@persons) {
            print header($thing);
            print message_for($person);
        }
    }
    

    The call to header could be moved out of the @persons loop, reducing the number of calls from @things * @persons to just @things:

    for my $thing (@things) {
        my $header = header($thing);
    
        for my $person (@persons) {
            print $header;
            print message_for($person);
        }
    }
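
    For the caching question above, here is a minimal sketch using the core Memoize module. The body of header and the input list are invented for illustration, and it assumes header is a pure function of its argument, which is the only case where caching is safe.

```perl
use strict;
use warnings;
use Memoize;

my $calls = 0;   # counts how often the real body runs

# Hypothetical stand-in for the header() sub in the loop above.
sub header {
    my ($thing) = @_;
    $calls++;
    return "== $thing ==\n";
}

# Replace header() with a caching wrapper keyed on its arguments.
memoize('header');

my @out = map { header($_) } qw(a b a a b);

print @out;
print "real calls: $calls\n";
```

    Note that the memoized wrapper is itself a function call plus a hash lookup, so this tends to pay off only when the wrapped function is noticeably more expensive than that.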
    
  • 2021-02-02 15:35

    The Perl optimizer is constant-folding the sprintf calls in your sample code.

    You can deparse it to see it happening:

    $ perl -MO=Deparse sample.pl
    foreach $_ (1 .. 10000000) {
        $a = &get_string();
    }
    sub get_string {
        return "abc\n";
    }
    foreach $_ (1 .. 10000000) {
        $a = "abc\n";
    }
    - syntax OK
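
    To keep constant folding from skewing the comparison, the benchmark can feed sprintf a value the compiler cannot treat as a constant. This is a rough sketch, assuming a lexical variable stands in for data read at run time; the variable name $value is made up for illustration.

```perl
use strict;
use warnings;
use Benchmark qw(cmpthese);

# A run-time value: sprintf cannot be folded at compile time
# when one of its arguments is a variable.
my $value = 'abc';

sub get_string { return sprintf "%s\n", $value }

# Run each variant for at least 1 CPU second and print a comparison table.
cmpthese(-1, {
    direct   => sub { my $s = sprintf "%s\n", $value },
    function => sub { my $s = get_string()           },
});
```

    With both variants doing real work, the difference between them should be closer to the actual cost of the call itself.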
    
  • 2021-02-02 15:44

    The issue you are raising does not have anything to do with loops. Both your A and B examples are the same in that regard. Rather, the issue is the difference between direct, in-line coding vs. calling the same code via a function.

    Function calls do involve an unavoidable overhead. I can't speak to the issue of whether and why this overhead is costlier in Perl relative to other languages, but I can provide an illustration of a better way to measure this sort of thing:

    use strict;
    use warnings;
    use Benchmark qw(cmpthese);
    
    sub just_return { return }
    sub get_string  { my $s = sprintf "%s\n", 'abc' }
    
    my %methods = (
        direct      => sub { my $s = sprintf "%s\n", 'abc' },
        function    => sub { my $s = get_string()          },
        just_return => sub { my $s = just_return()         },
    );
    
    cmpthese(-2, \%methods);
    

    Here's what I get on Perl v5.10.0 (MSWin32-x86-multi-thread). Very roughly, simply calling a function that does nothing is about as costly as directly running our sprintf code.

                     Rate    function just_return      direct
    function    1062833/s          --        -70%        -71%
    just_return 3566639/s        236%          --         -2%
    direct      3629492/s        241%          2%          --
    

    In general, if you need to optimize some Perl code for speed and you're trying to squeeze out every last drop of efficiency, direct coding is the way to go -- but that often comes with a price of less maintainability and readability. Before you get into the business of such micro-optimizing, however, you want to make sure that your underlying algorithm is solid and that you have a firm grasp on where the slow parts of your code actually reside. It's easy to waste a lot of effort working on the wrong thing.
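
    As a crude way of finding where the time goes before optimizing anything, each phase can be timed separately. The sketch below assumes the core Time::HiRes module and an invented 10-byte record layout (a 6-character id followed by a 4-digit number); a dedicated profiler would give far more detail.

```perl
use strict;
use warnings;
use Time::HiRes qw(time);

# Hypothetical data: 100_000 fixed-length records of 10 bytes each.
my $data = join '', map { sprintf "%-6.6s%04d", "r$_", $_ % 10000 } 1 .. 100_000;

my %elapsed;

# Phase 1: cut the buffer into fixed-length records.
my $t0 = time;
my @records = unpack '(A10)*', $data;
$elapsed{split} = time - $t0;

# Phase 2: parse each record into its two fields.
$t0 = time;
my @parsed = map { [ unpack 'A6 A4', $_ ] } @records;
$elapsed{parse} = time - $t0;

printf "%-6s %.4fs\n", $_, $elapsed{$_} for sort keys %elapsed;
```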
