Prove that using a range operator in a loop does not use additional memory

后端 未结 3 1142
野的像风
野的像风 2021-01-02 11:08

The current documentation for the Range operator .. states that it will not burn up memory for counting loops:

... The range operator is useful for wr

相关标签:
3条回答
  • 2021-01-02 11:50

    Is there a way that one could go about proving that a counting loop does not waste memory?

    One way might be to monitor the process with top. For example:

    perl -E 'for( 1 .. 1_000_000_000_000 ) { say $_ }' > /tmp/blah & top -pid $!
    

    This shows that the memory usage remains fairly constant (and trivial) while the program runs, despite iterating a list of one trillion items.

    Does anyone know which versions of Perl had the memory issue and when it was fixed?

    I couldn't find the precise version when this was changed with a quick grep of the perldeltas. I will note that perlop mentions it:

    The range operator is useful for writing "foreach (1..10)" loops and for doing
    slice operations on arrays. In the current implementation, no temporary
    array is created when the range operator is used as the expression in
    "foreach" loops, but older versions of Perl might burn a lot of memory
    when you write something like this:
    
        for (1 .. 1_000_000) {
            # code
        }
    

    But it doesn't say when that optimization was introduced. In any case, it was a long time ago.

    0 讨论(0)
  • First of all, here's a list of different types of for loops and the optimizations that can be applied. All of these are present in every version of Perl from 5.6 to 5.20 (present) inclusive, and I believe it's comprehensive.

    • for (EXPR; EXPR; EXPR)
      ⇒ "C-style for loop", an augmented while loop.
    • for (EXPRX..EXPRY)
      ⇒ A range and nothing else is optimized into a counting loop.
    • for (@ARRAY)
      ⇒ An array and nothing else is accessed directly rather than being flattened.
    • for (reverse LIST)
      ⇒ None of the above optimizations apply, but the list is traversed in reverse rather than being reversed.
    • for (LIST)
      ⇒ In a generic foreach loop, the LIST expression is evaluated before the loop starts.

    When CONSTX..CONSTY is flattened (i.e. anywhere other than in for (CONSTX..CONSTY)), it is flattened at compile-time rather than run-time.


    Black box proof

    Baseline memory usage:

    $ perl -e'system(ps, ho, rss, 0+$$);'
     1540         # 1.5 MiB
    

    The general case flattens.

    $ perl -e'$y=2_000_000; for ((),1..$y) { system(ps, ho, rss, 0+$$); last }'
    80208         # 78 MiB
    

    Or worse. (It flattens into an array at compile-time in addition to the normal stack usage.)

    $ perl -e'for ((),1..2_000_000) { system(ps, ho, rss, 0+$$); last }'
    143224        # 140 MiB
    

    for (CONST..CONST) doesn't flatten.

    $ perl -e'for (1..2_000_000) { system(ps, ho, rss, 0+$$); last }'
     1540         # 1.5 MiB
    

    In fact, for (EXPR..EXPR) in general doesn't flatten.

    $ perl -e'$y=2_000_000; for (1..$y) { system(ps, ho, rss, 0+$$); last }'
     1540         # 1.5 MiB
    

    Even without tools, you could tell the the difference in compilation time.

    $ time perl -c -e'1 for 1..2_000_000'
    -e syntax OK
    
    real    0m0.010s
    user    0m0.004s
    sys     0m0.000s
    
    $ time perl -c -e'1 for (),1..2_000_000'
    -e syntax OK
    
    real    0m1.197s
    user    0m0.952s
    sys     0m0.232s
    

    White box proof

    The unoptimized case uses a range operator in list context. Full list in memory.

    $ perl -MO=Concise,-exec -e'$y=1_000_000; 1 for (),1..$y;'
    ...
    8  <|> range(other->9)[t3] lK/1                      <-- Range operator
    9      <#> gvsv[*y] s
    a      <1> flop lKM
               goto b
    i  <$> const[IV 1] s
    j  <1> flip[t4] lK/LINENUM
    b  <#> gv[*_] s
    c  <{> enteriter(next->d last->g redo->d) lK/8       <-- No S
    ...
    

    This is what a range flattened at compile-time looks like:

    $ perl -MO=Concise,-exec -e'1 for (),1..1_000_000;'
    ...
    4  <$> const[AV ] s                                  <-- Constant array
    5  <1> rv2av lKPM/1
    6  <#> gv[*_] s
    7  <{> enteriter(next->8 last->b redo->8) lK/8       <-- No S
    ...
    

    You can see that for (CONST..CONST) creates an enteriter with the "S" flag. On enteriter, the "S" flag means it's a counting loop.

    $ perl -MO=Concise,-exec -e'1 for 1..1_000_000;'
    ...
    4  <$> const[IV 1] s
    5  <$> const[IV 1000000] s
    6  <#> gv[*_] s
    7  <{> enteriter(next->8 last->b redo->8) lKS/8       <-- S
    ...
    

    Same for for (EXPR..EXPR) in general.

    $ perl -MO=Concise,-exec -e'$y=1_000_000; 1 for 1..$y;'
    ...
    8  <$> const[IV 1] s
    9  <#> gvsv[*y] s
    a  <#> gv[*_] s
    b  <{> enteriter(next->c last->f redo->c) lKS/8       <-- S
    ...
    

    Even for (@a) isn't flattened!

    $ perl -MO=Concise,-exec -e'1 for @a;'
    ...
    4  <#> gv[*a] s
    5  <1> rv2av[t2] sKRM/1
    6  <#> gv[*_] s
    7  <{> enteriter(next->8 last->b redo->8) lKS/8       <-- S
    ...
    

    Double-check

    $ perl -MO=Concise,-exec -e'1 for (),@a;'
    ...
    4  <#> gv[*a] s
    5  <1> rv2av[t2] lKM/1
    6  <#> gv[*_] s
    7  <{> enteriter(next->8 last->b redo->8) lK/8        <-- No S
    ...
    

    Looking up the code for the "S" flag will confirm all of this.

    • pp_enteriter (PL_op->op_flags & OPf_STACKED checks for "S")
    • pp_iter.
    0 讨论(0)
  • 2021-01-02 12:14

    The last time perlop says this

    Be aware that under the current implementation, a temporary array is created, so you'll burn a lot of memory if you write something like this:

    is in version 5.4_68.

    In 5.4_69 (released on 29 June 1998) it changes to being very close to the current version

    In the current implementation, no temporary array is created when the range operator is used as the expression in C loops, but older versions of Perl might burn a lot of memory when you write something like this:

    But I can't find the change mentioned in a perldelta anywhere!

    Anyway, we're talking about a sixteen-year-old fix,

    0 讨论(0)
提交回复
热议问题