Why does Perl's tr/\n// get slower and slower as line lengths increase?

慢半拍i 2021-02-07 20:25

In perlfaq5, there's an answer for "How do I count the number of lines in a file?". The current answer suggests a sysread and a tr/\n//. I wanted to tr

3 answers
  •  陌清茗 (original poster)
    2021-02-07 20:48

    I'm also seeing tr/// get relatively slower as the line lengths increase, although the effect isn't as dramatic. These results are from ActivePerl 5.10.1 (32-bit) on Windows 7 x64. I also got "too few iterations for a reliable count" warnings at 100 iterations, so I bumped the count up to 500.

            VL: 4501.06    288
            LO: 749.25     29
            SH: 69.38      6
            VA: 104.66     55
                Rate VL-$count     VL-$.     VL-tr      VL-s     VL-wc
    VL-$count 2.82/s        --       -0%      -52%      -56%      -99%
    VL-$.     2.83/s        0%        --      -51%      -56%      -99%
    VL-tr     5.83/s      107%      106%        --      -10%      -99%
    VL-s      6.45/s      129%      128%       11%        --      -99%
    VL-wc      501/s    17655%    17602%     8490%     7656%        --
                Rate LO-$count     LO-$.      LO-s     LO-tr     LO-wc
    LO-$count 16.5/s        --       -1%      -50%      -51%      -97%
    LO-$.     16.8/s        1%        --      -50%      -51%      -97%
    LO-s      33.2/s      101%       98%        --       -3%      -94%
    LO-tr     34.1/s      106%      103%        3%        --      -94%
    LO-wc      583/s     3424%     3374%     1655%     1609%        --
                Rate SH-$count     SH-$.      SH-s     SH-tr     SH-wc
    SH-$count  120/s        --       -7%      -31%      -67%      -81%
    SH-$.      129/s        7%        --      -26%      -65%      -80%
    SH-s       174/s       45%       35%        --      -52%      -73%
    SH-tr      364/s      202%      182%      109%        --      -43%
    SH-wc      642/s      433%      397%      269%       76%        --
                Rate VA-$count     VA-$.      VA-s     VA-tr     VA-wc
    VA-$count 92.6/s        --       -5%      -36%      -63%      -79%
    VA-$.     97.4/s        5%        --      -33%      -61%      -78%
    VA-s       146/s       57%       50%        --      -42%      -67%
    VA-tr      252/s      172%      159%       73%        --      -43%
    VA-wc      439/s      374%      351%      201%       74%        --
    

    Edit: I ran a revised benchmark to compare the rates across different line lengths. It clearly shows that tr/// starts out with a big advantage on short lines, and that the advantage rapidly disappears as the lines grow longer. As for why this happens, I can only speculate that tr/// is optimized for short strings.
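
A minimal sketch of how a comparison across line lengths could be set up, for anyone who wants to try it on their own machine. This is not the benchmark code that produced the numbers above. The helper names (count_tr, count_readline, make_file), the 4096-byte block size, the line lengths, and the roughly 1 MB file size are all assumptions chosen for illustration.

```perl
use strict;
use warnings;
use Benchmark qw(cmpthese);
use File::Temp qw(tempdir);

# Hypothetical helper: count newlines by reading fixed-size blocks
# with sysread and applying tr/\n// to each block.
sub count_tr {
    my ($path) = @_;
    open my $fh, '<:raw', $path or die "Cannot open $path: $!";
    my ($lines, $buffer) = (0, '');
    while (sysread $fh, $buffer, 4096) {
        $lines += ($buffer =~ tr/\n//);
    }
    close $fh;
    return $lines;
}

# Hypothetical helper: count lines by reading them one at a time.
sub count_readline {
    my ($path) = @_;
    open my $fh, '<:raw', $path or die "Cannot open $path: $!";
    my $lines = 0;
    $lines++ while <$fh>;
    close $fh;
    return $lines;
}

# Hypothetical helper: write $line_count lines of $line_length characters each.
sub make_file {
    my ($path, $line_length, $line_count) = @_;
    open my $fh, '>:raw', $path or die "Cannot write $path: $!";
    print {$fh} 'x' x $line_length, "\n" for 1 .. $line_count;
    close $fh;
    return $path;
}

{
    my $dir   = tempdir(CLEANUP => 1);
    my $total = 1_000_000;    # approximate bytes per file (assumed)
    for my $length (10, 1_000, 100_000) {
        my $count = int($total / ($length + 1)) || 1;
        my $path  = make_file("$dir/len_$length.txt", $length, $count);
        print "Line length $length ($count lines)\n";
        # Run each counter for at least 1 CPU second and print a rate table.
        cmpthese(-1, {
            'tr'       => sub { count_tr($path) },
            'readline' => sub { count_readline($path) },
        });
    }
}
```

Keeping the file size roughly constant while varying the line length means any change in the relative rates comes from the line length rather than from the amount of data read. The absolute numbers will depend on the Perl build, the operating system, and disk caching.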

    Line count rate comparison http://img69.imageshack.us/img69/6250/linecount.th.png
